[jira] Commented: (HADOOP-3130) Shuffling takes too long to get the last map output.

Devaraj Das (JIRA) Sun, 30 Mar 2008 01:39:51 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583421#action_12583421
 ]


Devaraj Das commented on HADOOP-3130:
-------------------------------------

The events are stored in the jobtracker and fetched by the tasktrackers. The 
frequency of polling for map completion events is the same as the 
heartbeat interval (which depends on the cluster size). For example, for a 
cluster of 500 nodes it is going to be 10 seconds. The reason for a delay on 
the order of minutes in getting map completion events could be that the map is 
not complete yet (it's still in COMMIT_PENDING or RUNNING), or that the 
JobTracker is busy and is discarding RPCs. To ascertain the latter, you should 
take a look at the tasktracker logs on the reducer's host.
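
To illustrate how a polling interval can scale with cluster size, here is a 
minimal sketch. The linear rule, the class and method names, and both constants 
are assumptions picked only so that 500 nodes gives the 10 seconds mentioned 
above; this is not taken from the JobTracker source.

```java
public class HeartbeatIntervalSketch {
    // Assumed: one extra second of interval per 50 nodes (hypothetical value).
    static final int NODES_PER_SECOND = 50;
    // Assumed: lower bound on the interval for small clusters (hypothetical value).
    static final int MIN_INTERVAL_MS = 5000;

    // Returns an illustrative heartbeat (and event polling) interval in milliseconds.
    static int heartbeatIntervalMs(int clusterSize) {
        int scaled = 1000 * (clusterSize / NODES_PER_SECOND);
        return Math.max(scaled, MIN_INTERVAL_MS);
    }

    public static void main(String[] args) {
        // Under the assumed rule, a 500-node cluster polls every 10000 ms.
        System.out.println(heartbeatIntervalMs(500));
    }
}
```

Under such a rule the polling delay alone is a matter of seconds, so a delay of 
minutes would have to come from one of the other causes described above.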

> Shuffling takes too long to get the last map output.
> ----------------------------------------------------
>
>                 Key: HADOOP-3130
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3130
>             Project: Hadoop Core
>          Issue Type: Bug
>            Reporter: Runping Qi
>         Attachments: shuffling.log
>
>
> I noticed that towards the end of shuffling, the map output fetcher of the 
> reducer backs off too aggressively.
> I attach a fraction of one reduce log of my job.
> I noticed that the last map output was not fetched within 2 minutes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
