[ 
https://issues.apache.org/jira/browse/HADOOP-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated HADOOP-3067:
---------------------------------

    Attachment: HADOOP-3067.patch

The attached patch fixes one more bug that contributed to this problem:

{{BlockReader.checksumOk()}} is part of the data read protocol and belongs 
inside the BlockReader class. Earlier it was invoked by {{FSInputStream.read()}} 
only because BlockReader did not have access to the socket. Now BlockReader 
stores the socket. No API is changed. checksumOk() was invoked only by the 
regular read and not by the positional read used by the unit test. This, 
combined with the fact that the socket was not closed, made the DataNodes wait 
on the socket.
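
For illustration only, here is a minimal sketch of the idea, not the code in 
the patch. It assumes a reader that owns its connection, sends the 
"checksum ok" status itself, and is closed by the positional read path. 
Connection, FakeConnection, SketchBlockReader, PreadSketch, the CHECKSUM_OK 
value and the condition for sending the status are all hypothetical and are 
not Hadoop APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical stand-in for a socket; not a Hadoop type.
interface Connection extends Closeable {
    InputStream in();
    OutputStream out();
}

// In-memory fake so the sketch runs without a DataNode.
final class FakeConnection implements Connection {
    private final InputStream in;
    final ByteArrayOutputStream sent = new ByteArrayOutputStream();
    boolean closed = false;

    FakeConnection(byte[] data) { this.in = new ByteArrayInputStream(data); }
    public InputStream in() { return in; }
    public OutputStream out() { return sent; }
    public void close() { closed = true; }
}

// Hypothetical reader that keeps the connection, so the status reply
// is sent from inside the reader on every read path.
final class SketchBlockReader implements Closeable {
    static final int CHECKSUM_OK = 1; // assumed status code, illustration only

    private final Connection conn;
    private final int blockLength;
    private int consumed = 0;
    private boolean statusSent = false;

    SketchBlockReader(Connection conn, int blockLength) {
        this.conn = conn;
        this.blockLength = blockLength;
    }

    int read(byte[] buf, int off, int len) throws IOException {
        int n = conn.in().read(buf, off, len);
        if (n > 0) {
            consumed += n;
        }
        // Assumption: the status is sent once the whole block was consumed.
        if (consumed >= blockLength && !statusSent) {
            checksumOk();
        }
        return n;
    }

    private void checksumOk() throws IOException {
        conn.out().write(CHECKSUM_OK);
        conn.out().flush();
        statusSent = true;
    }

    public void close() throws IOException {
        conn.close();
    }
}

final class PreadSketch {
    // Positional-read style helper: the reader, and with it the
    // connection, is always closed, even if a read throws.
    static int pread(Connection conn, int blockLength, byte[] buf)
            throws IOException {
        try (SketchBlockReader reader = new SketchBlockReader(conn, blockLength)) {
            int total = 0;
            while (total < buf.length) {
                int n = reader.read(buf, total, buf.length - total);
                if (n < 0) {
                    break; // end of stream
                }
                total += n;
            }
            return total;
        }
    }
}
```

With this shape, a caller of PreadSketch.pread() neither sends the status nor 
closes the connection itself, so the sending side is not left waiting on 
either.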


> DFSInputStream 'pread' does not close its sockets
> -------------------------------------------------
>
>                 Key: HADOOP-3067
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3067
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-3067.patch, HADOOP-3067.patch
>
>
> {{DFSInputStream.read(int, buffer)}} does not close the sockets it opens. 
> The main reason this problem did not show up until now is that the pread 
> interface is not used much.
> The TestCrcCorruption failure first reported in HADOOP-2902 is caused by 
> this bug. Hadoop 0.17 uses more file descriptors for each thread waiting on 
> socket I/O. Since the client does not close sockets, it leaves a lot of 
> DataNode threads waiting in the unit test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
