[
https://issues.apache.org/jira/browse/HADOOP-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675169#action_12675169
]
Hairong Kuang commented on HADOOP-5286:
---------------------------------------
> I checked the data node mentioned in the exception traces of the attached
> file. It was slow to the point of being dead, though ping was responding.
I believe that the read failure was caused by the slow datanode. I do not
think that DFSClient could be blocked for an hour and a half. From the attached
log, I do see that the same split file was read again and again. Does the
JobTracker reinsert a failed job back into the jobInitQueue?
> DFS client blocked for a long time reading blocks of a file on the JobTracker
> -----------------------------------------------------------------------------
>
> Key: HADOOP-5286
> URL: https://issues.apache.org/jira/browse/HADOOP-5286
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.20.0
> Reporter: Hemanth Yamijala
> Attachments: jt-log-for-blocked-reads.txt
>
>
> On a large cluster, we've observed that the DFS client was blocked reading a
> block of a file for almost an hour and a half. The file was being read by the
> JobTracker of the cluster, and was a split file of a job. In the NameNode
> logs, we observed the following message for the block:
> Inconsistent size for block blk_2044238107768440002_840946 reported from
> <ip>:<port> current size is 195072 reported size is 1318567
> Details follow.
>