[jira] [Commented] (HDFS-6227) Short circuit read failed due to ClosedChannelException

Colin Patrick McCabe (JIRA) Thu, 22 May 2014 16:25:30 -0700

    [ https://issues.apache.org/jira/browse/HDFS-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006598#comment-14006598 ]


Colin Patrick McCabe commented on HDFS-6227:
--------------------------------------------

It seems that whenever a thread is interrupted while in {{FileChannel#read}}, 
the channel is immediately closed (the interrupted thread itself gets a 
{{ClosedByInterruptException}}).  This causes problems for short-circuit reads, 
since multiple threads may be (p)reading from a single pair of file descriptors 
(for the block and the checksum).
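
To make the failure mode concrete, here is a minimal standalone sketch (not HDFS 
code) of a pending interrupt closing a {{FileChannel}} for every user of it.  The 
class and method names are made up for illustration, and the "second reader" is 
simulated on the same thread after clearing the interrupt flag.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; nothing here comes from the HDFS code base.
final class InterruptClosesChannel {

    /** Returns true if an interrupt closed the channel for a later reader too. */
    static boolean interruptClosesChannel() throws IOException {
        Path tmp = Files.createTempFile("scr-demo", ".bin");
        try {
            Files.write(tmp, new byte[1024]);
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                // Set the interrupt status before the read.  An interruptible
                // channel treats this like an interrupt arriving mid-read.
                Thread.currentThread().interrupt();
                try {
                    ch.read(ByteBuffer.allocate(16), 0); // positional read (pread)
                    return false; // not expected: the read should have failed
                } catch (ClosedByInterruptException e) {
                    // The interrupted reader sees this exception, and the
                    // channel has been closed as a side effect.
                } finally {
                    Thread.interrupted(); // clear the flag for the next step
                }
                // Stand-in for a second reader sharing the same channel.
                try {
                    ch.read(ByteBuffer.allocate(16), 0);
                    return false; // not expected: the channel should be closed
                } catch (ClosedChannelException e) {
                    return !ch.isOpen();
                }
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("channel closed for other readers: "
                + interruptClosesChannel());
    }
}
```

The second read fails even though that reader was never interrupted, which is 
the situation concurrent short-circuit readers end up in.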

We can certainly check if either channel was closed in 
{{BlockReaderLocal#close}}, and mark the replica as stale in that case.  That 
will limit the harm somewhat.  But there isn't any easy way to save concurrent 
readers from getting the same {{ClosedChannelException}}.  Theoretically we 
could wrap every call to {{BlockReader#read}} in a retry block that treated 
this problem differently from a regular I/O error.  But there are a lot of 
callers and that retry code is already a little complex.
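
For illustration only, a retry block of that kind might look roughly like the 
sketch below.  {{ReadOp}}, {{readWithRetry}} and the {{reopen}} callback are 
hypothetical names, not part of the existing {{BlockReader}} API, and the real 
retry code would have more cases to handle.

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;

// Hypothetical sketch; these names do not exist in HDFS.
final class ClosedChannelRetry {

    /** Stand-in for a single read call against a block reader. */
    interface ReadOp {
        int read() throws IOException;
    }

    /**
     * Runs op, retrying only when the channel was closed underneath us.
     * Any other IOException propagates immediately as a regular I/O error.
     */
    static int readWithRetry(ReadOp op, Runnable reopen, int maxRetries)
            throws IOException {
        for (int attempt = 0; ; attempt++) {
            try {
                return op.read();
            } catch (ClosedChannelException e) {
                // If this thread is the one that was interrupted, honor the
                // interrupt instead of retrying.  Also give up once the
                // retry budget is spent.
                if (Thread.currentThread().isInterrupted() || attempt >= maxRetries) {
                    throw e;
                }
                reopen.run(); // assumed to obtain fresh file descriptors
            }
        }
    }
}
```

Every caller would have to supply its own way of reopening the replica, which is 
part of why this approach gets complicated.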

For the purposes of YARN, I think checking whether the channel is closed in 
{{BlockReaderLocal#close}} is enough, since each container will only be running 
one thing at a time, as I understand it.
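
Roughly, the check in {{close}} could be sketched as follows.  This is not the 
actual patch: {{Replica}}, {{Reader}} and {{markStale}} are simplified stand-ins 
for the real short-circuit replica and {{BlockReaderLocal}} classes.

```java
import java.io.Closeable;
import java.nio.channels.FileChannel;

// Hypothetical, simplified stand-ins; not the real HDFS classes.
final class StaleOnClose {

    /** A shared replica: one channel for block data, one for checksums. */
    static final class Replica {
        final FileChannel dataChannel;
        final FileChannel checksumChannel;
        private volatile boolean stale;

        Replica(FileChannel dataChannel, FileChannel checksumChannel) {
            this.dataChannel = dataChannel;
            this.checksumChannel = checksumChannel;
        }

        void markStale() { stale = true; }

        boolean isStale() { return stale; }
    }

    /** A reader that borrows the replica's channels. */
    static final class Reader implements Closeable {
        private final Replica replica;

        Reader(Replica replica) { this.replica = replica; }

        @Override
        public void close() {
            // If either channel was closed (for example by an interrupt
            // during a read), mark the replica stale so that later readers
            // do not reuse the dead file descriptors.
            if (!replica.dataChannel.isOpen() || !replica.checksumChannel.isOpen()) {
                replica.markStale();
            }
        }
    }
}
```

This does nothing for readers that are already using the closed channels; it 
only keeps new readers from picking up the broken replica.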

> Short circuit read failed due to ClosedChannelException
> -------------------------------------------------------
>
>                 Key: HDFS-6227
>                 URL: https://issues.apache.org/jira/browse/HDFS-6227
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Jing Zhao
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-6227.000.patch, 
> ShortCircuitReadInterruption.test.patch
>
>
> While running tests in a single-node cluster, where short-circuit read is 
> enabled and multiple threads may read the same file concurrently, one of the 
> reads got a ClosedChannelException and failed. See the comments for the full 
> exception trace.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
