[ 
https://issues.apache.org/jira/browse/HDFS-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320169#comment-14320169
 ] 

Frode Halvorsen commented on HDFS-7787:
---------------------------------------

Actually, my situation was that the node under decommission was told to 
replicate blocks with live copies before blocks without live copies. That was 
my 'problem'.  
The node under decommission reports two numbers:
"number of underreplicated blocks" and "blocks with no live replicas". In my 
experience, the number of underreplicated blocks went down while the 
number of 'no live replicas' blocks stayed the same. If I restarted the datanode, 
it started to replicate the most important blocks again, but after a while it 
went back to replicating only the underreplicated blocks.

But yes, it would also be nice to have priority for '1 live replica' over 
'2 live replicas' once the '0 live replicas' queue is empty :)  
I have not looked into how to contribute yet, so I won't assign this one to 
myself just now :)  I'm still learning to use this properly, and I think 
I have a few issues in my setup that I need to resolve before I start coding 
:)
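
As a rough illustration of the ordering asked for above (blocks with 0 live 
replicas first, then 1, then 2), here is a minimal sketch. It is not how HDFS 
actually schedules replication; the function name, block ids, and the 
(block_id, live_replicas) input shape are all made up for the example.

```python
import heapq


def replication_order(blocks):
    """Return block ids ordered by fewest live replicas first.

    `blocks` is an iterable of (block_id, live_replicas) pairs.
    Hypothetical helper for illustration only, not an HDFS API.
    """
    # The sequence number breaks ties, so blocks with the same replica
    # count keep the order in which they were reported.
    heap = [(live, seq, block_id)
            for seq, (block_id, live) in enumerate(blocks)]
    heapq.heapify(heap)

    order = []
    while heap:
        _live, _seq, block_id = heapq.heappop(heap)
        order.append(block_id)
    return order


if __name__ == "__main__":
    pending = [("blk_a", 2), ("blk_b", 0), ("blk_c", 1), ("blk_d", 0)]
    print(replication_order(pending))
```

With this ordering, a block whose only copy sits on the decommissioning node 
is always picked before any block that still has a live copy elsewhere.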

> Wrong priorty of replication
> ----------------------------
>
>                 Key: HDFS-7787
>                 URL: https://issues.apache.org/jira/browse/HDFS-7787
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.6.0
>         Environment: 2 namenodes HA, 6 datanodes in two racks
>            Reporter: Frode Halvorsen
>              Labels: balance, hdfs, replication-performance
>
> Each file is set to 3 replicas, split across different racks.
> After a simulated crash of one rack (shut down all nodes, deleted the 
> data directory and started the nodes again) and decommission of one of the 
> nodes in the other rack, the replication does not follow 'normal' rules...
> My cluster has approx. 25 million files, and the one node I am now trying to 
> decommission has 9 million underreplicated blocks and 3.5 million blocks 
> with 'no live replicas'. After a restart of the node, it starts to replicate 
> both types of blocks, but after a while it only replicates underreplicated 
> blocks that have other live copies. I would think the 'normal' way to do 
> this would be to make sure that all blocks for which this node holds the 
> only copy are the first to be replicated/balanced?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)