[ https://issues.apache.org/jira/browse/HDFS-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jing Zhao updated HDFS-6227:
----------------------------
Attachment: ShortCircuitReadInterruption.test.patch
I checked the test log and had an offline discussion with [~vinodkv]. It looks
like a YARN container was killed while one of its threads was reading a local
file through short circuit read. The interrupt closed the corresponding file
channel, but the closed channel stayed in the short circuit cache. Later
readers of the same file then hit this ClosedChannelException when they fetch
the file channel from the cache.
The attached test reproduces the issue.
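
The underlying java.nio behavior can be seen in isolation with the sketch
below. It is not the attached test and does not use any HDFS classes; the class
name InterruptedChannelDemo and the temp file are made up for illustration. It
relies on the documented rule that a thread whose interrupt status is already
set gets a ClosedByInterruptException from a blocking channel operation, and
that the channel is closed as a side effect.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustration only: plain java.nio, no HDFS classes involved.
final class InterruptedChannelDemo {

    // Reports which exception the interrupted reader and the next reader see.
    static String run() {
        try {
            Path file = Files.createTempFile("scr-demo", ".bin");
            Files.write(file, new byte[] {1, 2, 3, 4});
            String first = "ok";
            String second = "ok";
            // "shared" stands in for a channel handed out by a cache.
            try (FileChannel shared = FileChannel.open(file, StandardOpenOption.READ)) {
                // First reader: its thread is interrupted, so the read fails
                // and the channel itself is closed.
                Thread.currentThread().interrupt();
                try {
                    shared.read(ByteBuffer.allocate(4), 0);
                } catch (ClosedByInterruptException e) {
                    first = e.getClass().getSimpleName();
                } finally {
                    Thread.interrupted(); // clear the flag for the rest of the demo
                }
                // Second reader: not interrupted, but it reuses the same channel.
                try {
                    shared.read(ByteBuffer.allocate(4), 0);
                } catch (ClosedChannelException e) {
                    second = e.getClass().getSimpleName();
                }
            } finally {
                Files.deleteIfExists(file);
            }
            return "first=" + first + " second=" + second;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The second read fails even though its thread was never interrupted, which
matches the symptom seen by later readers of the cached channel.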
While YARN may want to shut down more gracefully instead of interrupting the
thread while it is still blocked in a read, HDFS should perhaps also provide a
better recovery mechanism for this case (e.g., purge the closed file channel
from the cache and retry).
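
A minimal sketch of the purge-and-retry idea follows. PurgingChannelCache is a
hypothetical stand-in built on a ConcurrentHashMap; it is not the HDFS
ShortCircuitCache API and not a proposed patch. It assumes one retry is enough
and that a reader whose own thread was interrupted should purge the channel but
not retry, since a retry on an interrupted thread would only close the fresh
channel as well.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical channel cache for illustration; not the HDFS ShortCircuitCache.
final class PurgingChannelCache {
    private final ConcurrentMap<Path, FileChannel> channels = new ConcurrentHashMap<>();

    // Returns the cached channel for the file, opening it on first use.
    FileChannel fetch(Path file) {
        return channels.computeIfAbsent(file, p -> {
            try {
                return FileChannel.open(p, StandardOpenOption.READ);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    // Positional read that purges a closed channel and retries once.
    int read(Path file, ByteBuffer dst, long position) throws IOException {
        for (int attempt = 0; ; attempt++) {
            FileChannel ch = fetch(file);
            try {
                return ch.read(dst, position);
            } catch (ClosedByInterruptException e) {
                // This thread was interrupted: purge, but do not retry.
                channels.remove(file, ch);
                throw e;
            } catch (ClosedChannelException e) {
                // Someone else closed the channel: purge the stale entry.
                channels.remove(file, ch);
                if (attempt >= 1) {
                    throw e;
                }
            }
        }
    }

    // Simulates a stale cache entry and returns the bytes read after recovery.
    static int demo() {
        try {
            Path file = Files.createTempFile("scr-cache", ".bin");
            Files.write(file, new byte[] {1, 2, 3, 4});
            PurgingChannelCache cache = new PurgingChannelCache();
            cache.fetch(file).close(); // stands in for an interrupted reader
            int n = cache.read(file, ByteBuffer.allocate(4), 0);
            cache.fetch(file).close(); // release the reopened channel
            Files.deleteIfExists(file);
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes read after purge and retry: " + demo());
    }
}
```

The two-argument remove(key, value) only evicts the entry if it still maps to
the stale channel, so a concurrent reader that has already reopened the file
does not lose its fresh channel.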
> Short circuit read failed due to ClosedChannelException
> -------------------------------------------------------
>
> Key: HDFS-6227
> URL: https://issues.apache.org/jira/browse/HDFS-6227
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.4.0
> Reporter: Jing Zhao
> Attachments: ShortCircuitReadInterruption.test.patch
>
>
> While running tests in a single node cluster, where short circuit read is
> enabled and multiple threads may read the same file concurrently, one of the
> reads got a ClosedChannelException and failed. See the comment for the full
> exception trace.
--
This message was sent by Atlassian JIRA
(v6.2#6252)