[
https://issues.apache.org/jira/browse/HDFS-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133462#comment-15133462
]
Xiaobing Zhou commented on HDFS-9752:
-------------------------------------
Thanks [~walter.k.su] for this critical fix and [~kihwal] for reporting it. I've
been debugging this for a while.
1. Does it make sense to make pipelineRecoveryCount reconfigurable?
2. Without the patch, it does not seem to always be true that 'a permanent write
failure can occur', as you mentioned. I attached a simple test (roughly sketched
below):
1). It writes some data with hflush and then hangs there, holding the stream open.
2). One of the DNs in the pipeline is manually brought down, which triggers a
pipeline recovery.
3). After 5 such recoveries, the pipeline is closed; the data is successfully
written to HDFS.
Can you explain why there is no failure in this case? Could the only reason be
that the data is too small? Thanks.
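To make the scenario concrete, here is a rough sketch of the shape of that test
(not the attached test itself: the class name, path, sleep, and the NEVER
replacement policy are mine, it assumes a 3-DN MiniDFSCluster, and it stops a DN
only once where the attached test drives repeated recoveries):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class SlowWriterRecoverySketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    // sketch-only setting: skip datanode replacement in this tiny 3-node cluster
    conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");
    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
    try {
      cluster.waitActive();
      DistributedFileSystem fs = cluster.getFileSystem();

      // 1). write a little data, hflush, and hold the stream open
      FSDataOutputStream out = fs.create(new Path("/slow-writer"), (short) 3);
      out.write("hello".getBytes("UTF-8"));
      out.hflush();

      // 2). bring down one DN in the pipeline; the client sees the broken
      // pipeline and runs pipeline recovery
      cluster.stopDataNode(0);

      // keep the stream open without writing anything; only heartbeat
      // packets flow, so the last acked sequence number does not advance
      Thread.sleep(30000L);

      // 3). close the stream; in the scenario above this succeeds and the
      // data is readable from HDFS
      out.close();
      System.out.println("file length = "
          + fs.getFileStatus(new Path("/slow-writer")).getLen());
    } finally {
      cluster.shutdown();
    }
  }
}
{code}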
> Permanent write failures may happen to slow writers during datanode rolling
> upgrades
> ------------------------------------------------------------------------------------
>
> Key: HDFS-9752
> URL: https://issues.apache.org/jira/browse/HDFS-9752
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Kihwal Lee
> Assignee: Walter Su
> Priority: Critical
> Attachments: HDFS-9752.01.patch
>
>
> When datanodes are being upgraded, an out-of-band ack is sent upstream and
> the client does a pipeline recovery. The client may hit this multiple times
> as more nodes get upgraded. This normally does not cause any issue, but if
> the client is holding the stream open without writing any data during this
> time, a permanent write failure can occur.
> This is because there is a limit of 5 recovery trials for the same packet,
> which is tracked by "last acked sequence number". Since the empty heartbeat
> packets for an idle output stream do not increment the sequence number, the
> write will fail after seeing 5 pipeline breakages caused by datanode upgrades.
> This check/limit was added to avoid spinning until running out of nodes in
> the cluster due to corruption or other irrecoverable conditions. The
> datanode upgrade-restart should be excluded from the count.
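
For context on why an idle stream hits the limit, here is a toy model of the
client-side check (not the actual DataStreamer/DFSOutputStream code; the field
names are borrowed from this discussion and the real logic differs in detail):
{code:java}
import java.io.IOException;

// Toy model of the 5-trial limit: the recovery counter only resets when the
// last acked sequence number advances, which never happens for an idle stream
// that sends only heartbeat packets.
class PipelineRecoveryLimit {
  private long lastAckedSeqno = -1;            // advanced only by data packets
  private long lastAckedSeqnoBeforeFailure = -1;
  private int pipelineRecoveryCount = 0;
  private static final int MAX_RECOVERIES_SAME_PACKET = 5;

  /** Called when a data packet is acked; heartbeat packets never call this. */
  void onPacketAcked(long seqno) {
    lastAckedSeqno = seqno;
  }

  /** Called on each pipeline recovery; throws once the limit is hit. */
  void onPipelineRecovery() throws IOException {
    if (lastAckedSeqnoBeforeFailure != lastAckedSeqno) {
      // progress was acked since the previous failure: restart the count
      lastAckedSeqnoBeforeFailure = lastAckedSeqno;
      pipelineRecoveryCount = 1;
    } else if (++pipelineRecoveryCount > MAX_RECOVERIES_SAME_PACKET) {
      // an idle stream never advances lastAckedSeqno, so 5 upgrade-restarts
      // in a row land here and the write fails permanently
      throw new IOException(
          "Failing write. Tried pipeline recovery 5 times without success.");
    }
  }
}
{code}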
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)