[jira] [Commented] (HDFS-13739) Add option to disable rack local write preference

Hudson (Jira) Tue, 18 Feb 2020 19:29:00 -0800


    [ 
https://issues.apache.org/jira/browse/HDFS-13739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17039675#comment-17039675
 ]


Hudson commented on HDFS-13739:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17964 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17964/])
HDFS-13739. Add option to disable rack local write preference. (ayushsaxena: 
rev ac4b556e2d44d3cd10b81c190ecee23e2dd66c10)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/AddBlockFlag.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDefaultBlockPlacementPolicy.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CreateFlag.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java


> Add option to disable rack local write preference
> -------------------------------------------------
>
>                 Key: HDFS-13739
>                 URL: https://issues.apache.org/jira/browse/HDFS-13739
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover, block placement, datanode, fs, 
> hdfs, hdfs-client, namenode, nn, performance
>    Affects Versions: 2.7.3
>         Environment: Hortonworks HDP 2.6
>            Reporter: Hari Sekhon
>            Assignee: Ayush Saxena
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: HDFS-13739-01.patch
>
>
> Request to be able to disable Rack Local Write preference / Write All 
> Replicas to different Racks.
> Current HDFS write pattern of "local node, rack local node, other rack node" 
> is good for most purposes but there are at least 2 scenarios where this is 
> not ideal:
>  # Rack-by-rack maintenance leaves data at risk of being down to its last 
> remaining replica. If a single datanode then failed, it would likely cause a 
> data outage, or even data loss if the rack is lost or an upgrade fails (or 
> perhaps it's a rack rebuild). The only current workaround is setting 
> replicas to 4, which would reduce write performance and waste storage.
>  # Major Storage Imbalance across datanodes when there is an uneven layout of 
> datanodes across racks - some nodes fill up while others are half empty.
> I have observed this storage imbalance on a cluster where half the nodes were 
> 85% full and the other half were only 50% full.
> Rack layouts like the following illustrate this - the nodes in the same rack 
> will only choose to send half their block replicas to each other, so they 
> will fill up first, while other nodes will receive far fewer replica blocks:
> {code:java}
> NumNodes - Rack 
> 2 - rack 1
> 2 - rack 2
> 1 - rack 3
> 1 - rack 4 
> 1 - rack 5
> 1 - rack 6{code}
> In this case, if I reduce the number of replicas to 2, I get an almost 
> perfect spread of blocks across all datanodes because HDFS has no choice but 
> to place the 2nd replica on a different rack. If I increase the replicas 
> back to 3, it goes back to 85% on half the nodes and 50% on the other half, 
> because the extra replica is written only to rack-local nodes.
> Why not just run the HDFS balancer to fix it, you might ask? This is a heavily 
> loaded HBase cluster - aside from destroying HBase's data locality and 
> performance by moving blocks out from underneath RegionServers - as soon as 
> an HBase major compaction occurs (at least weekly), all blocks will get 
> re-written by HBase and the HDFS client will again write to local node, rack 
> local node, other rack node - resulting in the same storage imbalance again. 
> Hence this cannot be solved by running HDFS balancer on HBase clusters - or 
> for any application sitting on top of HDFS that has any HDFS block churn.
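
The imbalance described in the issue can be illustrated with a small calculation. The sketch below is a simplified model of the write pattern as the reporter describes it ("local node, rack local node, other rack node"), not the actual logic in BlockPlacementPolicyDefault. It assumes every datanode writes the same number of blocks (as with one RegionServer per node), that a writer with no rack peer sends both non-local replicas to other racks, and that targets are chosen uniformly. The class name RackPlacementSketch and the method expectedLoad are hypothetical.

```java
public class RackPlacementSketch {

    /**
     * Expected number of replicas landing on each node when every node
     * writes exactly one block. rackOf[i] is the rack id of node i.
     * Intended for replication 2 or 3 only (simplified model).
     */
    static double[] expectedLoad(int[] rackOf, int replication) {
        int n = rackOf.length;
        double[] load = new double[n];
        for (int writer = 0; writer < n; writer++) {
            // Count nodes sharing the writer's rack (excluding the writer).
            int peers = 0;
            for (int i = 0; i < n; i++) {
                if (i != writer && rackOf[i] == rackOf[writer]) {
                    peers++;
                }
            }
            int offRack = n - 1 - peers;

            // Replica 1 always goes to the local (writing) node.
            load[writer] += 1.0;

            // Assumption: a rack-local replica is written only when
            // replication is 3 and the writer has at least one rack peer.
            int onPeers = (replication >= 3 && peers > 0) ? 1 : 0;
            int onOffRack = (replication - 1) - onPeers;

            // Spread the remaining replicas uniformly over the candidates.
            for (int i = 0; i < n; i++) {
                if (i == writer) {
                    continue;
                }
                if (rackOf[i] == rackOf[writer]) {
                    load[i] += (double) onPeers / peers;
                } else {
                    load[i] += (double) onOffRack / offRack;
                }
            }
        }
        return load;
    }

    public static void main(String[] args) {
        // Layout from the issue: 2, 2, 1, 1, 1, 1 nodes on racks 1..6.
        int[] rackOf = {1, 1, 2, 2, 3, 4, 5, 6};
        for (int replication : new int[] {3, 2}) {
            double[] load = expectedLoad(rackOf, replication);
            System.out.println("replication=" + replication);
            for (int i = 0; i < load.length; i++) {
                System.out.printf("  node %d (rack %d): %.3f%n",
                        i, rackOf[i], load[i]);
            }
        }
    }
}
```

Under these assumptions, nodes on the two-node racks receive about 3.48 replicas per round at replication 3 versus about 2.52 for nodes on single-node racks, while at replication 2 the figures are about 1.90 and 2.10. The direction matches the reporter's observation, though the model is too coarse to reproduce the 85% and 50% figures.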



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

