[ https://issues.apache.org/jira/browse/HDFS-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202348#comment-14202348 ]
Zhe Zhang commented on HDFS-7374:
---------------------------------

[~mingma] Thanks much for clarifying the state machine. I agree my option #2 is cleaner and makes the decommissioning of dead nodes much faster. I'll go ahead with that approach now.

bq. If the node stays in Dead, DECOMMISSION_INPROGRESS for too long, have the higher layer application remove the node from exclude file and thus abort the decommission process. This will transition the node to Dead, NORMAL.

The specific higher layer application in my case is Cloudera Manager and I think it's possible to add this logic. However I don't know how easy it is to change all similar management applications.

bq. HDFS-6791 mentioned another way to address the original issue. When nodes become dead, mark them DECOMMISSIONED and fix the replication to handle this case. In other words, get rid of Dead, DECOMMISSION_INPROGRESS state.

Do you mean allowing a {{DECOMMISSIONED}} node to be the source of a replica transfer? It seems a little fragile to me; intuitively, it could surprise upper layer applications that a {{DECOMMISSIONED}} node is still actively transferring data. But I would like to hear the opinions from other people.

> Allow decommissioning of dead DataNodes
> ---------------------------------------
>
> Key: HDFS-7374
> URL: https://issues.apache.org/jira/browse/HDFS-7374
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
>
> We have seen the use case of decommissioning DataNodes that are already dead
> or unresponsive, and not expected to rejoin the cluster.
> The logic introduced by HDFS-6791 will mark those nodes as
> {{DECOMMISSION_INPROGRESS}}, with a hope that they can come back and finish
> the decommission work. If an upper layer application is monitoring the
> decommissioning progress, it will hang forever.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
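
For illustration only, the abort-on-timeout workaround quoted in the comment above (the higher layer application removes a stuck node from the exclude file, which returns it to Dead, NORMAL) can be sketched as a toy model. This is not HDFS code: {{ToyCluster}}, {{Node}}, {{decommission_with_timeout}} and {{max_polls}} are hypothetical names, and the model assumes that a live node finishes decommissioning on the next poll while a dead node never makes progress.

```python
from dataclasses import dataclass
from enum import Enum


class AdminState(Enum):
    NORMAL = "NORMAL"
    DECOMMISSION_INPROGRESS = "DECOMMISSION_INPROGRESS"
    DECOMMISSIONED = "DECOMMISSIONED"


@dataclass
class Node:
    name: str
    alive: bool
    admin_state: AdminState = AdminState.NORMAL


class ToyCluster:
    """Hypothetical in-memory stand-in for the NameNode's view (not an HDFS API)."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self.exclude = set()  # stands in for the exclude file

    def refresh(self):
        # Re-read the exclude set and update each node's admin state.
        for node in self.nodes.values():
            if node.name in self.exclude:
                if node.admin_state is AdminState.NORMAL:
                    node.admin_state = AdminState.DECOMMISSION_INPROGRESS
            else:
                # Not excluded (or no longer excluded): back to NORMAL.
                node.admin_state = AdminState.NORMAL

    def tick(self):
        # Assumption: a live node completes in one step; a dead node is stuck.
        for node in self.nodes.values():
            if node.admin_state is AdminState.DECOMMISSION_INPROGRESS and node.alive:
                node.admin_state = AdminState.DECOMMISSIONED


def decommission_with_timeout(cluster, name, max_polls):
    """Start decommissioning `name`; abort if it is not done after max_polls."""
    cluster.exclude.add(name)
    cluster.refresh()
    for _ in range(max_polls):
        cluster.tick()
        if cluster.nodes[name].admin_state is AdminState.DECOMMISSIONED:
            return True
    # Timed out: remove the node from the exclude set to abort the decommission.
    cluster.exclude.discard(name)
    cluster.refresh()
    return False


cluster = ToyCluster([Node("dn1", alive=True), Node("dn2", alive=False)])
live_done = decommission_with_timeout(cluster, "dn1", max_polls=3)
dead_done = decommission_with_timeout(cluster, "dn2", max_polls=3)
```

Without the timeout branch, a monitor polling {{dn2}} in this model would loop forever, which is the hang described in the issue. The sketch also shows the cost of the workaround: the logic lives in the management application, so every such application would need its own copy.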