[ 
https://issues.apache.org/jira/browse/HDFS-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6227:
----------------------------

    Attachment: ShortCircuitReadInterruption.test.patch

I checked the test log and discussed it offline with [~vinodkv]. It looks 
like YARN killed a container while one of its threads was reading a local 
file through short-circuit read. The reading thread was interrupted, which 
closed the corresponding file channel. Later readers of the same file then 
hit this ClosedChannelException when they fetch the closed file channel from 
the short-circuit cache.
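
To illustrate the mechanism in isolation, here is a minimal plain-JDK sketch 
(not HDFS code; the class and method names are hypothetical). It relies on 
the documented java.nio behavior that a FileChannel is closed when a thread 
with its interrupt status set performs blocking I/O on it. For simplicity the 
interrupt flag is set before the read rather than delivered mid-read, and a 
single thread plays both readers.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; it only mimics a channel shared through a cache.
public class InterruptedChannelDemo {

    /**
     * Returns true if a second read on the shared channel fails with
     * ClosedChannelException after the first read was interrupted.
     */
    static boolean sharedChannelBrokenAfterInterrupt() throws IOException {
        Path file = Files.createTempFile("scr-demo", ".dat");
        try {
            Files.write(file, new byte[] {1, 2, 3, 4});
            // Stands in for the channel that a cache would hand to every reader.
            FileChannel shared = FileChannel.open(file, StandardOpenOption.READ);

            // First reader: its thread is interrupted, so the read closes the channel.
            Thread.currentThread().interrupt();
            try {
                shared.read(ByteBuffer.allocate(4), 0);
            } catch (ClosedByInterruptException expected) {
                // The channel is now closed for everyone who holds a reference to it.
            } finally {
                Thread.interrupted(); // clear the flag so later I/O is unaffected
            }

            // Second reader: not interrupted itself, but it gets the closed channel.
            try {
                shared.read(ByteBuffer.allocate(4), 0);
                return false;
            } catch (ClosedChannelException e) {
                return true;
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("second reader failed: " + sharedChannelBrokenAfterInterrupt());
    }
}
```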

The attached test reproduces the issue.

While YARN may want to close more gracefully instead of directly interrupting 
the thread while it is still blocked in a read, HDFS should probably also 
provide a better recovery mechanism for this case (e.g., purge the closed 
file channel from the cache and retry).
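
A rough sketch of what such purge-and-retry recovery could look like, using 
only the JDK. This is not the HDFS short-circuit cache or a proposed patch: 
PurgingChannelCache, acquire, purge, and read are hypothetical names, and the 
cache is assumed to be a simple map from file path to an open FileChannel. 
The sketch retries once when it finds a channel that some other reader 
closed, and does not retry when the current thread is the interrupted one.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache, for illustration only.
public class PurgingChannelCache {
    private final Map<Path, FileChannel> cache = new HashMap<>();

    /** Returns the cached channel for the file, opening one if none is cached. */
    synchronized FileChannel acquire(Path file) throws IOException {
        FileChannel ch = cache.get(file);
        if (ch == null) {
            ch = FileChannel.open(file, StandardOpenOption.READ);
            cache.put(file, ch);
        }
        return ch;
    }

    /** Drops the stale channel, unless another reader has already replaced it. */
    private synchronized void purge(Path file, FileChannel stale) {
        cache.remove(file, stale);
    }

    /** Positional read that purges a closed channel and retries once. */
    public int read(Path file, ByteBuffer dst, long position) throws IOException {
        FileChannel ch = acquire(file);
        try {
            return ch.read(dst, position);
        } catch (ClosedByInterruptException e) {
            // This thread was interrupted: purge the channel it just closed,
            // but do not retry, since a retry would close the new channel too.
            purge(file, ch);
            throw e;
        } catch (ClosedChannelException e) {
            // Some other reader closed the channel: purge it and retry once
            // on a freshly opened channel.
            purge(file, ch);
            return acquire(file).read(dst, position);
        }
    }
}
```

Retrying only once keeps the sketch simple; a real implementation would also 
need to decide how to handle readers that are concurrently using the purged 
channel.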

> Short circuit read failed due to ClosedChannelException
> -------------------------------------------------------
>
>                 Key: HDFS-6227
>                 URL: https://issues.apache.org/jira/browse/HDFS-6227
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Jing Zhao
>         Attachments: ShortCircuitReadInterruption.test.patch
>
>
> While running tests on a single-node cluster where short-circuit read is 
> enabled and multiple threads may read the same file concurrently, one of the 
> reads failed with ClosedChannelException. See the comment for the full 
> exception trace.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
