[ https://issues.apache.org/jira/browse/HDFS-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810911#comment-13810911 ]

Andrew Wang commented on HDFS-5446:
-----------------------------------

Very interesting idea, thanks for filing this, Nathan. Are you thinking this 
would be a NN-side thing like DN decommissioning? The two are kind of similar; 
decommissioning DNs aren't assigned more blocks to write, and I believe they 
are deprioritized for reads as well. For a rolling restart, we just wouldn't 
move the blocks off. It'd be up to the admin to toggle this state on and off 
as the rolling restart progresses.
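
To make the decommissioning analogy concrete, here is a minimal, purely 
illustrative Java sketch (not actual HDFS code; the enum, class names, and 
placement logic are invented for this example) of an NN-side "draining" admin 
state that keeps a datanode out of new write pipelines without re-replicating 
its blocks, with the admin toggling the state around the restart:

{code:java}
import java.util.ArrayList;
import java.util.List;

public class DrainingStateSketch {

    // Hypothetical admin states; the real HDFS enum and names differ.
    enum AdminState { NORMAL, DECOMMISSIONING, DRAINING }

    static class DataNode {
        final String host;
        AdminState state = AdminState.NORMAL;
        DataNode(String host) { this.host = host; }
    }

    // New write pipelines skip nodes that are decommissioning or draining.
    static List<String> choosePipeline(List<DataNode> all, int replication) {
        List<String> targets = new ArrayList<>();
        for (DataNode dn : all) {
            if (dn.state == AdminState.NORMAL) {
                targets.add(dn.host);
                if (targets.size() == replication) break;
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<DataNode> cluster = new ArrayList<>();
        for (int i = 1; i <= 4; i++) cluster.add(new DataNode("dn" + i));

        // Admin marks dn2 as draining before restarting it; no re-replication
        // happens, the node is simply skipped for new pipelines.
        cluster.get(1).state = AdminState.DRAINING;
        System.out.println("while dn2 drains: " + choosePipeline(cluster, 3));

        // After the restart the admin toggles the state back off.
        cluster.get(1).state = AdminState.NORMAL;
        System.out.println("after restart:    " + choosePipeline(cluster, 3));
    }
}
{code}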

> Consider supporting a mechanism to allow datanodes to drain outstanding work 
> during rolling upgrade
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5446
>                 URL: https://issues.apache.org/jira/browse/HDFS-5446
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.2.0
>            Reporter: Nathan Roberts
>
> Rebuilding write pipelines is expensive and this can happen many times during 
> a rolling restart of datanodes (i.e. during a rolling upgrade). It seems like 
> it might help if datanodes could be told to drain current work while 
> rejecting new requests - possibly with a new response indicating the node is 
> temporarily unavailable (it's not broken, it's just going through a 
> maintenance phase where it shouldn't accept new work). 
> Waiting just a few seconds is normally enough to clear up a good percentage 
> of the open requests without error, thus reducing the overhead associated 
> with restarting lots of datanodes in rapid succession.
> Obviously this would need a timeout to make sure the datanode doesn't wait forever.
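
For illustration only, a minimal Java sketch of the DN-side drain behavior 
described above, under the assumption of an invented 
ERROR_TEMPORARILY_UNAVAILABLE status (the real DataTransferProtocol response 
codes and DataNode internals are not what is shown here): new requests are 
refused while in-flight work finishes, bounded by a drain timeout so the 
restart is never blocked indefinitely.

{code:java}
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class DataNodeDrainSketch {
    // Hypothetical status codes for this sketch only.
    enum Status { SUCCESS, ERROR_TEMPORARILY_UNAVAILABLE }

    private final AtomicBoolean draining = new AtomicBoolean(false);
    private final AtomicInteger inFlightOps = new AtomicInteger(0);

    // Called for each incoming write request.
    Status handleWriteRequest() {
        if (draining.get()) {
            // Node is not broken, just in a maintenance phase; the client
            // should retry elsewhere or wait.
            return Status.ERROR_TEMPORARILY_UNAVAILABLE;
        }
        inFlightOps.incrementAndGet();
        try {
            // ... perform the transfer ...
            return Status.SUCCESS;
        } finally {
            inFlightOps.decrementAndGet();
        }
    }

    // Called just before shutdown/restart: stop accepting new work, wait for
    // outstanding ops to finish, but never longer than drainTimeoutMs.
    void drainAndStop(long drainTimeoutMs) throws InterruptedException {
        draining.set(true);
        long deadline = System.currentTimeMillis() + drainTimeoutMs;
        while (inFlightOps.get() > 0 && System.currentTimeMillis() < deadline) {
            Thread.sleep(100);
        }
        // Proceed with the restart whether or not everything drained.
    }

    public static void main(String[] args) throws InterruptedException {
        DataNodeDrainSketch dn = new DataNodeDrainSketch();
        System.out.println("before drain:   " + dn.handleWriteRequest());
        dn.drainAndStop(5_000);   // e.g. wait up to a few seconds
        System.out.println("while draining: " + dn.handleWriteRequest());
    }
}
{code}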



--
This message was sent by Atlassian JIRA
(v6.1#6144)