[
https://issues.apache.org/jira/browse/HDFS-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Advertising
BELUGA BEHR updated HDFS-13448:
-------------------------------
Attachment: HDFS-13448.1.patch
> HDFS Block Placement - Ignore Locality
> --------------------------------------
>
> Key: HDFS-13448
> URL: https://issues.apache.org/jira/browse/HDFS-13448
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: block placement, hdfs-client
> Affects Versions: 2.9.0, 3.0.1
> Reporter: BELUGA BEHR
> Priority: Minor
> Attachments: HDFS-13448.1.patch
>
>
> According to the HDFS Block Place Rules:
> {quote}
> /**
> * The replica placement strategy is that if the writer is on a datanode,
> * the 1st replica is placed on the local machine,
> * otherwise a random datanode. The 2nd replica is placed on a datanode
> * that is on a different rack. The 3rd replica is placed on a datanode
> * which is on a different node of the rack as the second replica.
> */
> {quote}
> However, there is a hint for the hdfs-client that allows the block placement
> request to not put a block replica on the local datanode _where 'local' means
> the same host as the client is being run on._
> {quote}
> /**
> * Advise that a block replica NOT be written to the local DataNode where
> * 'local' means the same host as the client is being run on.
> *
> * @see CreateFlag#NO_LOCAL_WRITE
> */
> {quote}
> I propose that we add a new flag that allows the hdfs-client to request that
> the first block replica be placed on a random DataNode in the cluster. The
> subsequent block replicas should follow the normal block placement rules.
> The issue is that when the {{NO_LOCAL_WRITE}} is enabled, the first block
> replica is not placed on the local node, but it is still placed on the local
> rack. Where this comes into play is where you have, for example, a flume
> agent that is loading data into HDFS.
> If the Flume agent is running on a DataNode, then by default, the DataNode
> local to the Flume agent will always get the first block replica and this
> leads to un-even block placements, with the local node always filling up
> faster than any other node in the cluster.
> Modifying this example, if the DataNode is removed from the host where the
> Flume agent is running, or this {{NO_LOCAL_WRITE}} is enabled by Flume, then
> the default block placement policy will still prefer the local rack. This
> remedies the situation only so far as now the first block replica will always
> be distributed to a DataNode on the local rack.
> This new flag would allow a single Flume agent to distribute the blocks
> randomly, evenly, over the entire cluster instead of hot-spotting the local
> node or the local rack.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org