[ 
https://issues.apache.org/jira/browse/HDFS-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905797#comment-13905797
 ] 

Brandon Li commented on HDFS-5583:
----------------------------------

Some early comments. I haven't finished reviewing all the changes.
- DataNode#shutdownDatanode() can be called only once and throws an exception 
on subsequent invocations.
I would imagine that after an administrator issues the "dfsadmin shutdownDatanode 
-upgrade" command, he/she would like to know if the DataNodes received it and if 
they are in upgrade preparation state. Unless I missed something, it seems the 
only way to know it is to issue the same command again and expect to receive an 
exception. Would it be better to either let shutdownDatanode return an error 
code or have getDataNodeInfo include current datanode state?

- Do we plan to add more OOB Acks anytime soon? We can always add new enum 
values later instead of reserving a few OOB_RESERVEDx for now. 

- In DataNode.java: is "forUpgrade", "upgrade" or "shutdownForUpgrade" a better 
name than the variable name "restarting"? :-)

- DataXceiverServer.java: please clean up the unused import.
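
To make the first point concrete, here is a rough sketch of the kind of behavior I have in mind: a repeated call reports a status instead of throwing, and the current state can be queried separately. All names here (ShutdownStateSketch, State, Result, getState) are hypothetical and are not taken from the patch or from the existing DataNode code.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch only; not code from the HDFS-5583 patch. */
public class ShutdownStateSketch {
    /** Assumed lifecycle states; the real DataNode may track this differently. */
    public enum State { RUNNING, SHUTTING_DOWN, SHUTTING_DOWN_FOR_UPGRADE }

    /** Assumed result codes handed back to the admin command. */
    public enum Result { ACCEPTED, ALREADY_IN_PROGRESS }

    private final AtomicReference<State> state =
        new AtomicReference<State>(State.RUNNING);

    /**
     * Only the first call moves the node out of RUNNING. Later calls
     * return ALREADY_IN_PROGRESS rather than throwing an exception.
     */
    public Result shutdownDatanode(boolean forUpgrade) {
        State target = forUpgrade
            ? State.SHUTTING_DOWN_FOR_UPGRADE
            : State.SHUTTING_DOWN;
        return state.compareAndSet(State.RUNNING, target)
            ? Result.ACCEPTED
            : Result.ALREADY_IN_PROGRESS;
    }

    /** Stand-in for exposing the state through something like getDataNodeInfo. */
    public State getState() {
        return state.get();
    }
}
```

With something along these lines, the administrator could check the state directly instead of re-issuing the command and waiting for an exception.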


> Make DN send an OOB Ack on shutdown before restarting
> -----------------------------------------------------
>
>                 Key: HDFS-5583
>                 URL: https://issues.apache.org/jira/browse/HDFS-5583
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>         Attachments: HDFS-5583.patch, HDFS-5583.patch, HDFS-5583.patch
>
>
> Add an ability for data nodes to send an OOB response in order to indicate an 
> upcoming upgrade-restart. Client should ignore the pipeline error from the 
> node for a configured amount of time and try to reconstruct the pipeline without 
> excluding the restarted node.  If the node does not come back in time, 
> regular pipeline recovery should happen.
> This feature is useful for the applications with a need to keep blocks local. 
> If the upgrade-restart is fast, the wait is preferable to losing locality.  
> It could also be used in general instead of the draining-writer strategy.
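
The client-side behavior described above could be sketched roughly as follows. This is only an illustration of the wait-then-fall-back decision under stated assumptions: the class name, the Action enum, and the millisecond timeout parameter are hypothetical and do not come from the patch or from the HDFS client.

```java
/** Hypothetical sketch of the client-side decision; not code from the patch. */
public class RestartWaitSketch {
    /** What the client does when it sees a pipeline error from the node. */
    public enum Action { WAIT_FOR_RESTART, EXCLUDE_AND_RECOVER }

    private final long restartTimeoutMs;   // assumed configured wait time
    private long restartDeadlineMs = -1L;  // -1 means no restart OOB ack seen

    public RestartWaitSketch(long restartTimeoutMs) {
        this.restartTimeoutMs = restartTimeoutMs;
    }

    /** Record that the node announced an upcoming upgrade-restart. */
    public void onRestartOobAck(long nowMs) {
        restartDeadlineMs = nowMs + restartTimeoutMs;
    }

    /**
     * Inside the wait window the node is kept in the pipeline. Once the
     * window has passed, or if no OOB ack was ever seen, regular pipeline
     * recovery (excluding the node) takes over.
     */
    public Action onPipelineError(long nowMs) {
        if (restartDeadlineMs >= 0 && nowMs < restartDeadlineMs) {
            return Action.WAIT_FOR_RESTART;
        }
        return Action.EXCLUDE_AND_RECOVER;
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a real client would presumably read a monotonic clock instead.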



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
