[ https://issues.apache.org/jira/browse/HDFS-16127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kihwal Lee updated HDFS-16127:
------------------------------
Description: When a block is being closed, the data streamer in the client
waits for the final ACK to be delivered. If an exception is received during
this wait, the close is retried, on the assumption that the close did not
complete. HDFS-15813 invalidated this assumption, resulting in permanent
write failures in some close error cases involving slow nodes. Less
frequently, it also causes data loss. (was:
While waiting for the final ack for the empty close packet, the main
DataStreamer thread can receive an exception even when the final ack was
received and the pipeline closed normally. This leads to an unnecessary close
recovery that results in a permanent write failure or silent data loss.)
> Improper pipeline close recovery causes a permanent write failure or data
> loss.
> -------------------------------------------------------------------------------
>
> Key: HDFS-16127
> URL: https://issues.apache.org/jira/browse/HDFS-16127
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Kihwal Lee
> Priority: Major
>
> When a block is being closed, the data streamer in the client waits for the
> final ACK to be delivered. If an exception is received during this wait, the
> close is retried, on the assumption that the close did not complete.
> HDFS-15813 invalidated this assumption, resulting in permanent write
> failures in some close error cases involving slow nodes. Less frequently, it
> also causes data loss.
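
The race described above can be illustrated with a small, self-contained
model. This is only a sketch under stated assumptions, not the DataStreamer
code and not the patch for this issue: the class CloseWaitSketch, the
WaitResult fields, and both decide methods are hypothetical names invented
for illustration. It contrasts retrying on any exception with first checking
whether the final ack already arrived.

```java
// Hypothetical model of the close wait. None of these names exist in Hadoop.
public class CloseWaitSketch {

    /** What the streamer observed while waiting on the empty close packet. */
    public static final class WaitResult {
        public final boolean finalAckReceived;
        public final boolean exceptionSeen;

        public WaitResult(boolean finalAckReceived, boolean exceptionSeen) {
            this.finalAckReceived = finalAckReceived;
            this.exceptionSeen = exceptionSeen;
        }
    }

    public enum Action { FINISH_BLOCK, RETRY_CLOSE, KEEP_WAITING }

    /** Behavior described in the report: any exception triggers a close retry. */
    public static Action decideNaive(WaitResult r) {
        if (r.exceptionSeen) {
            return Action.RETRY_CLOSE;
        }
        return r.finalAckReceived ? Action.FINISH_BLOCK : Action.KEEP_WAITING;
    }

    /**
     * One possible guard (an assumption, not the committed fix): if the final
     * ack has already arrived, the close succeeded, so a late exception does
     * not start a recovery.
     */
    public static Action decideGuarded(WaitResult r) {
        if (r.finalAckReceived) {
            return Action.FINISH_BLOCK;
        }
        return r.exceptionSeen ? Action.RETRY_CLOSE : Action.KEEP_WAITING;
    }

    public static void main(String[] args) {
        // The problematic case: the ack arrived, but an exception was also seen.
        WaitResult racy = new WaitResult(true, true);
        System.out.println("naive:   " + decideNaive(racy));
        System.out.println("guarded: " + decideGuarded(racy));
    }
}
```

In this model the two policies differ only when both the final ack and an
exception are observed, which is the case the report identifies as leading to
an unnecessary close recovery.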