[jira] [Commented] (HDFS-7787) Split QUEUE_HIGHEST_PRIORITY in UnderReplicatedBlocks to give more priority to blocks on nodes being decomissioned

Frode Halvorsen (JIRA) Sat, 14 Feb 2015 14:06:29 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321728#comment-14321728
 ]


Frode Halvorsen commented on HDFS-7787:
---------------------------------------

I have now changed the parameters again to speed up replication, and I see 
that the decommissioning node is told to replicate to either two or three 
other nodes. Most of the requests are for only two copies, so I suspect that 
the blocks it is asked to replicate do have live replicas in the cluster.
Approximately 1/5 of the replication requests are for three nodes:
2015-02-14 23:00:16,008 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839116_24099285 to datanode(s) x.x.x.206:50010 
x.x.x.207:50010 x.x.x.209:50010
2015-02-14 23:00:16,009 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839119_24099288 to datanode(s) x.x.x.206:50010 
x.x.x.205:50010 x.x.x.209:50010
2015-02-14 23:00:16,010 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839113_24099282 to datanode(s) x.x.x.204:50010 x.x.x.205:50010
2015-02-14 23:00:16,010 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839114_24099283 to datanode(s) x.x.x.204:50010 x.x.x.209:50010
2015-02-14 23:00:16,011 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839166_24099335 to datanode(s) x.x.x.204:50010 x.x.x.207:50010
2015-02-14 23:00:16,012 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839162_24099331 to datanode(s) x.x.x.206:50010 
x.x.x.205:50010 x.x.x.209:50010
2015-02-14 23:00:20,046 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839309_24099478 to datanode(s) x.x.x.206:50010 x.x.x.205:50010
2015-02-14 23:00:20,047 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839310_24099479 to datanode(s) x.x.x.204:50010 x.x.x.205:50010
2015-02-14 23:00:20,047 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839352_24099521 to datanode(s) x.x.x.206:50010 x.x.x.209:50010
2015-02-14 23:00:20,048 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839359_24099528 to datanode(s) x.x.x.206:50010 x.x.x.209:50010
2015-02-14 23:00:20,048 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839358_24099527 to datanode(s) x.x.x.204:50010 x.x.x.209:50010
2015-02-14 23:00:20,049 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839357_24099526 to datanode(s) x.x.x.206:50010 x.x.x.207:50010
2015-02-14 23:00:22,056 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839241_24099410 to datanode(s) x.x.x.206:50010 x.x.x.205:50010
2015-02-14 23:00:22,057 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to 
replicate blk_1097839242_24099411 to datanode(s) x.x.x.204:50010 
x.x.x.209:50010 x.x.x.205:50010


The node at 208 is decommissioning, and I would say that this proves that the 
node is asked to replicate blocks that have live replicas as well as blocks 
with no live replicas. I haven't looked at the code, but it seems wrong to me 
that the decommissioning node replicates any blocks other than those without 
live replicas.
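
To check the "approximately 1/5" ratio against a larger NameNode log, the requests can be tallied by the number of target datanodes each one names. The following is only a minimal sketch in Python: the function name `count_targets`, the regular expression, and the two-entry sample are illustrative assumptions based on the log format shown above, not part of HDFS.

```python
import re
from collections import Counter

# Matches one "ask <source> to replicate <block> to datanode(s) <targets...>"
# entry. Using \s+ between tokens lets a match span lines that were wrapped
# by the mail client. The pattern is an assumption derived from the log
# excerpt above, not an official HDFS log grammar.
REQUEST = re.compile(
    r"ask\s+(\S+)\s+to\s+replicate\s+(blk_\d+_\d+)\s+to\s+datanode\(s\)"
    r"((?:\s+[\w.\-]+:\d+)+)"
)


def count_targets(log_text, source=None):
    """Return a Counter mapping number of targets -> number of requests.

    If `source` is given, only requests sent to that datanode are counted.
    """
    counts = Counter()
    for src, _block, targets in REQUEST.findall(log_text):
        if source is None or src == source:
            counts[len(targets.split())] += 1
    return counts


# Two entries copied from the excerpt above, including the line wrapping.
SAMPLE = """\
2015-02-14 23:00:16,008 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to
replicate blk_1097839116_24099285 to datanode(s) x.x.x.206:50010
x.x.x.207:50010 x.x.x.209:50010
2015-02-14 23:00:16,010 INFO BlockStateChange: BLOCK* ask x.x.x.208:50010 to
replicate blk_1097839113_24099282 to datanode(s) x.x.x.204:50010 x.x.x.205:50010
"""

if __name__ == "__main__":
    for n, total in sorted(count_targets(SAMPLE).items()):
        print(f"{n} targets: {total} request(s)")
```

Passing the full log text and `source="x.x.x.208:50010"` restricts the tally to requests sent to the decommissioning node.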

> Split QUEUE_HIGHEST_PRIORITY in UnderReplicatedBlocks to give more priority 
> to blocks on nodes being decomissioned
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7787
>                 URL: https://issues.apache.org/jira/browse/HDFS-7787
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.6.0
>         Environment: 2 namenodes HA, 6 datanodes in two racks
>            Reporter: Frode Halvorsen
>              Labels: balance, hdfs, replication-performance
>
> Each file is set to 3 replicas, split across different racks.
> After a simulated crash of one rack (shutdown of all nodes, deleted 
> data directory, and started the nodes) and decommission of one of the nodes 
> in the other rack, the replication does not follow 'normal' rules...
> My cluster has approx. 25 million files, and the one node I am now trying to 
> decommission has 9 million under-replicated blocks and 3.5 million blocks 
> with 'no live replicas'. After a restart of the node, it starts to replicate 
> both types of blocks, but after a while it only replicates under-replicated 
> blocks with other live copies. I would think that the 'normal' way to do 
> this would be to make sure that all blocks of which this node keeps the only 
> copy are the first to be replicated/balanced?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
