[ 
https://issues.apache.org/jira/browse/HDFS-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated HDFS-5579:
-------------------------------

    Attachment: HDFS-5579.patch
                HDFS-5579-branch-1.2.patch

This patch lets the NameNode replicate blocks that belong to under 
construction files, except for each file's last block.
In addition, if the only blocks left on a decommissioning DataNode are 
last blocks of under construction files that each have more than one 
live replica elsewhere, the NameNode can set the node to DECOMMISSIONED.
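
A rough, self-contained sketch of the completion check described above, 
using hypothetical simplified types (this is not the actual patch code; 
the Block fields below are stand-ins for the real BlockInfo and replica 
counting in BlockManager):

import java.util.List;

class DecommissionCheckSketch {
    // Hypothetical stand-in for the real block/replica bookkeeping.
    static class Block {
        boolean lastBlockOfUnderConstructionFile;
        int liveReplicas;
        boolean needsReplication;  // fewer replicas than the target
    }

    /** True when nothing blocks decommission: every block that still
     *  needs replication is the last block of an under construction
     *  file and already has more than one live replica elsewhere. */
    static boolean canMarkDecommissioned(List<Block> blocksOnNode) {
        for (Block b : blocksOnNode) {
            if (!b.needsReplication) {
                continue;  // already sufficiently replicated
            }
            boolean safeToSkip =
                b.lastBlockOfUnderConstructionFile && b.liveReplicas > 1;
            if (!safeToSkip) {
                return false;  // real replication work still pending
            }
        }
        return true;
    }
}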

> Under construction files make DataNode decommission take very long hours
> ------------------------------------------------------------------------
>
>                 Key: HDFS-5579
>                 URL: https://issues.apache.org/jira/browse/HDFS-5579
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 1.2.0, 2.2.0
>            Reporter: zhaoyunjiong
>            Assignee: zhaoyunjiong
>         Attachments: HDFS-5579-branch-1.2.patch, HDFS-5579.patch
>
>
> We noticed that decommissioning DataNodes sometimes takes a very long 
> time, in some cases exceeding 100 hours.
> After checking the code, I found that 
> BlockManager:computeReplicationWorkForBlocks(List<List<Block>> 
> blocksToReplicate) will not replicate blocks that belong to under 
> construction files. However, 
> BlockManager:isReplicationInProgress(DatanodeDescriptor srcNode) keeps 
> the decommission in progress as long as any block still needs 
> replication, whether or not it belongs to an under construction file.
> That mismatch is why decommissioning sometimes takes so long.
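
To illustrate the mismatch described in the report above, here is a 
rough, self-contained sketch with hypothetical simplified types (not 
the real BlockManager code): the scheduling side skips blocks of under 
construction files, while the progress check counts every 
under-replicated block, so a node holding only such blocks never 
finishes decommissioning.

import java.util.ArrayList;
import java.util.List;

class DecommissionMismatchSketch {
    // Hypothetical stand-in for the real block metadata.
    static class Block {
        boolean underConstruction;
        int liveReplicas;
        int expectedReplicas;
    }

    // Analogue of computeReplicationWorkForBlocks: blocks of under
    // construction files are never scheduled for replication.
    static List<Block> scheduleReplicationWork(List<Block> needed) {
        List<Block> scheduled = new ArrayList<>();
        for (Block b : needed) {
            if (b.underConstruction) {
                continue;  // skipped, so it never gets replicated
            }
            scheduled.add(b);
        }
        return scheduled;
    }

    // Analogue of isReplicationInProgress: any under-replicated block,
    // under construction or not, keeps the node in
    // DECOMMISSION_INPROGRESS, which is the source of the hang.
    static boolean isReplicationInProgress(List<Block> blocksOnNode) {
        for (Block b : blocksOnNode) {
            if (b.liveReplicas < b.expectedReplicas) {
                return true;  // decommission cannot complete
            }
        }
        return false;
    }
}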



--
This message was sent by Atlassian JIRA
(v6.1#6144)
