[ 
https://issues.apache.org/jira/browse/HADOOP-5572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated HADOOP-5572:
---------------------------------

    Attachment: HADOOP-5572.v1.patch

Incorporated Jothi's first three comments.
Discussed comments 4 and 5 with Jothi offline. For comment 4, there seems 
to be no cleaner way, so I am keeping it as is. Regarding comment 5, checking 
for empty segments (by reading the segments) before the actual merges seems 
costly in terms of performance. So empty segments are not handled separately 
in our estimation, on the assumption that this won't hurt the approximation 
of mergeProgress much.

Fixed an issue in informReduceProgress() by changing the call from 
Progress.get() to Progress.getInternal(), because we need the progress for 
this phase/node only (and not for the whole tree). Made 
Progress.getInternal() public.
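
To illustrate the difference between the two calls, here is a minimal sketch 
of a progress tree. It is not Hadoop's actual Progress class: the names 
ProgressSketch, Node, addPhase() and startNextPhase() are hypothetical, and 
the equal weighting of phases is an assumption made for the example. Only the 
idea that get() reports the whole tree while getInternal() reports the 
current node comes from the change described above.

```java
import java.util.ArrayList;
import java.util.List;

public class ProgressSketch {
    /** Hypothetical, simplified progress node (not Hadoop's Progress class). */
    static class Node {
        private final Node parent;
        private final List<Node> phases = new ArrayList<>();
        private int currentPhase = 0;
        private float progress = 0f; // only meaningful for leaf nodes

        Node(Node parent) { this.parent = parent; }

        /** Adds a child phase; all phases are assumed to have equal weight. */
        Node addPhase() {
            Node phase = new Node(this);
            phases.add(phase);
            return phase;
        }

        void set(float value) { progress = value; }

        void startNextPhase() { currentPhase++; }

        /** Progress of the whole tree: walk up to the root first. */
        float get() {
            Node node = this;
            while (node.parent != null) {
                node = node.parent;
            }
            return node.getInternal();
        }

        /** Progress of this node (and its own sub-phases) only. */
        float getInternal() {
            int count = phases.size();
            if (count == 0) {
                return progress;
            }
            float perPhase = 1.0f / count;
            float current = currentPhase < count
                    ? phases.get(currentPhase).getInternal()
                    : 0f;
            return perPhase * (currentPhase + current);
        }
    }

    public static void main(String[] args) {
        Node task = new Node(null);
        Node first = task.addPhase();
        Node second = task.addPhase();

        first.set(1.0f);        // first phase finished
        task.startNextPhase();  // task moves on to the second phase
        second.set(0.5f);       // second phase is half done

        System.out.println(second.getInternal()); // 0.5  (this phase only)
        System.out.println(second.get());         // 0.75 (whole tree)
    }
}
```

In this sketch, reporting second.get() as the progress of the second phase 
would overstate it, which is the kind of mismatch the switch to 
getInternal() avoids.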

Attaching patch with the above changes. Please review and provide your comments.

> The map progress value should have a separate phase for doing the final sort.
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-5572
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5572
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-5572.patch, HADOOP-5572.v1.patch
>
>
> Currently, the final spill and sort doesn't record any progress while it 
> runs, leading to the perception that the map is done, but "stuck".

