Nathan Roberts created HDFS-5446:
------------------------------------
Summary: Consider supporting a mechanism to allow datanodes to
drain outstanding work during rolling upgrade
Key: HDFS-5446
URL: https://issues.apache.org/jira/browse/HDFS-5446
Project: Hadoop HDFS
Issue Type: Improvement
Components: datanode
Affects Versions: 2.2.0
Reporter: Nathan Roberts
Rebuilding write pipelines is expensive, and it can happen many times during a
rolling restart of datanodes (i.e. during a rolling upgrade). It might help if
datanodes could be told to drain their current work while rejecting new
requests, possibly with a new response indicating that the node is temporarily
unavailable (it's not broken; it's just going through a maintenance phase in
which it shouldn't accept new work).
Waiting just a few seconds is normally enough to clear up a good percentage of
the open requests without error, thus reducing the overhead associated with
restarting lots of datanodes in rapid succession.
This would obviously need a timeout to make sure the datanode doesn't wait
forever.
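To make the idea concrete, here is a rough sketch of the drain-with-timeout
behavior described above. It is only an illustration under my own assumptions:
DrainGate, tryBegin, end and drain are hypothetical names and not part of the
existing datanode code, and a real change would also need the new "temporarily
unavailable" response in the data transfer protocol.

```java
/**
 * Hypothetical sketch only; not the HDFS DataNode API.
 * Tracks in-flight requests, rejects new ones once draining starts,
 * and waits a bounded time for the outstanding ones to finish.
 */
final class DrainGate {
    private int inFlight = 0;
    private boolean draining = false;

    /**
     * Called when a new request arrives. Returns false once draining has
     * begun; the caller would then reply "temporarily unavailable".
     */
    public synchronized boolean tryBegin() {
        if (draining) {
            return false;
        }
        inFlight++;
        return true;
    }

    /** Called when a previously accepted request completes. */
    public synchronized void end() {
        inFlight--;
        if (inFlight == 0) {
            notifyAll(); // wake a thread blocked in drain()
        }
    }

    /**
     * Stops accepting new work and waits up to timeoutMillis for in-flight
     * requests to finish. Returns true if everything drained, false on
     * timeout or interruption (the restart would then proceed anyway).
     */
    public synchronized boolean drain(long timeoutMillis) {
        draining = true;
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (inFlight > 0) {
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return false; // timed out; do not wait forever
            }
            try {
                wait(remainingMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

In this sketch the shutdown path would call drain() with a few seconds of
timeout before stopping the datanode, and proceed with the restart whether or
not it returns true.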
--
This message was sent by Atlassian JIRA
(v6.1#6144)