[
https://issues.apache.org/jira/browse/HDFS-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900344#comment-13900344
]
Kihwal Lee commented on HDFS-5446:
----------------------------------
After the OOB acking feature, I believe we can make the DN tell writers to move
out more easily. Although this is less useful for rolling upgrades, it can solve
the problem of decommissioning nodes with long, slow writers. Clients will be
able to migrate their writes to another node, so even blocks with a single
replica will continue to work.
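The client-side reaction could look roughly like the sketch below: on receiving a maintenance-style OOB ack, the writer drops that datanode from the pipeline and, for a single-replica write, asks for a fresh target so the write keeps going. All names here (AckStatus, excludeNode, migrate) are hypothetical illustrations, not actual HDFS code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a writer migrating its pipeline after a datanode
// announces it is going into maintenance. Not real HDFS client code.
public class PipelineMigration {

    /** Rebuild the pipeline without the node that announced maintenance. */
    public static List<String> excludeNode(List<String> pipeline, String node) {
        List<String> rebuilt = new ArrayList<>(pipeline);
        rebuilt.remove(node);
        return rebuilt;
    }

    /**
     * Even a single-replica write survives: if excluding the draining node
     * empties the pipeline, continue on a replacement target (e.g. a new
     * node handed out by the namenode).
     */
    public static List<String> migrate(List<String> pipeline, String node,
                                       String replacement) {
        List<String> rebuilt = excludeNode(pipeline, node);
        if (rebuilt.isEmpty()) {
            rebuilt.add(replacement);
        }
        return rebuilt;
    }
}
```

The point of the sketch is the last branch: today a pipeline that loses its only replica fails, whereas with an explicit "move out" signal the client can recover by picking a new node before the old one goes down.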
> Consider supporting a mechanism to allow datanodes to drain outstanding work
> during rolling upgrade
> ---------------------------------------------------------------------------------------------------
>
> Key: HDFS-5446
> URL: https://issues.apache.org/jira/browse/HDFS-5446
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Affects Versions: 2.2.0
> Reporter: Nathan Roberts
>
> Rebuilding write pipelines is expensive and this can happen many times during
> a rolling restart of datanodes (i.e. during a rolling upgrade). It seems like
> it might help if datanodes could be told to drain current work while
> rejecting new requests - possibly with a new response indicating the node is
> temporarily unavailable (it's not broken, it's just going through a
> maintenance phase where it shouldn't accept new work).
> Waiting just a few seconds is normally enough to clear up a good percentage
> of the open requests without error, thus reducing the overhead associated
> with restarting lots of datanodes in rapid succession.
> Obviously this would need a timeout to make sure the datanode doesn't wait forever.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)