[ https://issues.apache.org/jira/browse/HDFS-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810911#comment-13810911 ]
Andrew Wang commented on HDFS-5446:
-----------------------------------
Very interesting idea, thanks for filing this, Nathan. Are you thinking this
would be a NN-side thing like DN decommissioning? The two are kind of similar:
decommissioning DNs aren't assigned new blocks to write, and I believe they are
deprioritized for reads as well. For a rolling restart, we just wouldn't move
the blocks off the node. It'd be up to the admin to toggle/untoggle this state
as the rolling restart progresses.
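To make the decommissioning analogy concrete, here is a minimal Java sketch of how an
admin-toggled, NN-side "draining" state might be consulted when choosing targets for new
writes. The AdminState/DatanodeInfo names are hypothetical stand-ins, not actual HDFS
classes; this is only an illustration of the idea, not a proposed patch.
{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a NN-side "draining" admin state,
// analogous to decommissioning but without moving blocks off the node.
public class DrainingStateSketch {

  // Hypothetical admin states; DRAINING would be set by the admin before
  // restarting a datanode and cleared once the node is back up.
  enum AdminState { NORMAL, DECOMMISSIONING, DRAINING }

  static class DatanodeInfo {
    final String hostname;
    volatile AdminState adminState = AdminState.NORMAL;
    DatanodeInfo(String hostname) { this.hostname = hostname; }
  }

  // Draining and decommissioning nodes are both skipped for new writes;
  // unlike decommissioning, draining does not trigger re-replication.
  static boolean isGoodTargetForWrite(DatanodeInfo dn) {
    return dn.adminState == AdminState.NORMAL;
  }

  static List<DatanodeInfo> chooseWriteTargets(List<DatanodeInfo> live, int replication) {
    List<DatanodeInfo> targets = new ArrayList<>();
    for (DatanodeInfo dn : live) {
      if (targets.size() == replication) break;
      if (isGoodTargetForWrite(dn)) {
        targets.add(dn);
      }
    }
    return targets;
  }

  public static void main(String[] args) {
    DatanodeInfo dn1 = new DatanodeInfo("dn1");
    DatanodeInfo dn2 = new DatanodeInfo("dn2");
    DatanodeInfo dn3 = new DatanodeInfo("dn3");
    // Admin marks dn2 as draining before restarting it.
    dn2.adminState = AdminState.DRAINING;
    for (DatanodeInfo dn : chooseWriteTargets(List.of(dn1, dn2, dn3), 2)) {
      System.out.println("target: " + dn.hostname);
    }
    // Prints dn1 and dn3; dn2 is skipped until the admin clears the state.
  }
}
{code}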
> Consider supporting a mechanism to allow datanodes to drain outstanding work
> during rolling upgrade
> ---------------------------------------------------------------------------------------------------
>
> Key: HDFS-5446
> URL: https://issues.apache.org/jira/browse/HDFS-5446
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.2.0
> Reporter: Nathan Roberts
>
> Rebuilding write pipelines is expensive and this can happen many times during
> a rolling restart of datanodes (i.e. during a rolling upgrade). It seems like
> it might help if datanodes could be told to drain current work while
> rejecting new requests - possibly with a new response indicating the node is
> temporarily unavailable (it's not broken, it's just going through a
> maintenance phase where it shouldn't accept new work).
> Waiting just a few seconds is normally enough to clear up a good percentage
> of the open requests without error, thus reducing the overhead associated
> with restarting lots of datanodes in rapid succession.
> Obviously this would need a timeout to make sure the datanode doesn't wait forever.
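A rough sketch of the drain-with-timeout behavior described in the issue (hypothetical
names and status codes, not the actual DataTransferProtocol): the datanode stops
accepting new write requests, replying that it is temporarily unavailable, while letting
in-flight requests finish within a bounded wait.
{code:java}
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a datanode-side drain: new requests are rejected with a
// "temporarily unavailable" reply while outstanding work finishes, and the drain is
// bounded by a timeout so the node never waits forever.
public class DrainOnRestartSketch {

  enum Status { SUCCESS, ERROR_TEMPORARILY_UNAVAILABLE }

  private final AtomicBoolean draining = new AtomicBoolean(false);
  private final AtomicInteger outstandingRequests = new AtomicInteger(0);

  // Called when a new write request arrives.
  Status acceptWriteRequest() {
    if (draining.get()) {
      // Not broken, just in a maintenance phase; the client should go elsewhere.
      return Status.ERROR_TEMPORARILY_UNAVAILABLE;
    }
    outstandingRequests.incrementAndGet();
    return Status.SUCCESS;
  }

  // Called when an in-flight request completes.
  void completeRequest() {
    outstandingRequests.decrementAndGet();
  }

  // Drain current work for at most maxWaitMillis, then return regardless.
  // Waiting just a few seconds typically clears most open requests cleanly.
  boolean drainAndStop(long maxWaitMillis) throws InterruptedException {
    draining.set(true);
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
    while (outstandingRequests.get() > 0 && System.nanoTime() < deadline) {
      Thread.sleep(100);
    }
    return outstandingRequests.get() == 0; // true if the drain was clean
  }

  public static void main(String[] args) throws InterruptedException {
    DrainOnRestartSketch dn = new DrainOnRestartSketch();
    dn.acceptWriteRequest();                     // in-flight work...
    dn.completeRequest();                        // ...which finishes
    System.out.println(dn.drainAndStop(5000));   // true: drained cleanly
    System.out.println(dn.acceptWriteRequest()); // ERROR_TEMPORARILY_UNAVAILABLE
  }
}
{code}
The timeout bounds the wait so a stuck client can't hold up the restart: once it expires,
the datanode proceeds with the restart even if some requests are still open.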
--
This message was sent by Atlassian JIRA
(v6.1#6144)