[ https://issues.apache.org/jira/browse/HADOOP-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499140 ]
Doug Cutting commented on HADOOP-1431:
--------------------------------------
Comparators can be performance-sensitive, so I'd like to see some benchmarks of
a ReportingComparator before we accept that solution.

The long-term best patch for this would be to modify SequenceFile to actually
report progress as it sorts. This is a long-standing issue.

The current bug is that the progress thread is running when there's no sorting
going on. It should only run during calls to MapTask#sortAndSpillToDisk(), not
during the entire map. This should be easy to patch for the short term.
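
To make the short-term idea concrete, here is a rough sketch of scoping the
progress thread to a single unit of work instead of the whole map. It is not
taken from the attached patch: ProgressSink, ScopedHeartbeat and
runWithHeartbeat are hypothetical names, and the Callable stands in for a call
such as MapTask#sortAndSpillToDisk().

```java
import java.util.concurrent.Callable;

/** Hypothetical stand-in for the task's progress callback; not Hadoop's actual interface. */
interface ProgressSink {
    void progress();
}

/** Runs a heartbeat thread only while one piece of work (for example, a sort) is executing. */
final class ScopedHeartbeat {
    private ScopedHeartbeat() {}

    static <T> T runWithHeartbeat(ProgressSink sink, long intervalMillis, Callable<T> work)
            throws Exception {
        Thread beat = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    sink.progress();
                    Thread.sleep(intervalMillis);
                }
            } catch (InterruptedException e) {
                // Interrupted by the caller once the work has finished: just exit.
            }
        }, "sort-progress");
        beat.setDaemon(true);
        beat.start();
        try {
            return work.call();
        } finally {
            // Stop reporting as soon as the work ends, whether it succeeded or threw.
            beat.interrupt();
            beat.join();
        }
    }
}
```

With this shape, a map that hangs outside the sort stops reporting progress and
can be timed out as usual. A sort that itself hangs would still be kept alive,
which is why this is only a short-term measure.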
> Map tasks can't timeout for failing to call progress
> ----------------------------------------------------
>
> Key: HADOOP-1431
> URL: https://issues.apache.org/jira/browse/HADOOP-1431
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.13.0
> Reporter: Owen O'Malley
> Assigned To: Arun C Murthy
> Fix For: 0.13.0
>
> Attachments: HADOOP-1431_1_20070525.patch
>
>
> Currently the map task runner creates a thread that calls progress every
> second to keep the system from killing the map if the sort takes too long.
> This is the wrong approach, because it will cause stuck tasks to not be
> killed. The right solution is to have the sort call progress as it actually
> makes progress. This is part of what is going on in HADOOP-1374. A map gets
> stuck at 100% progress, but not done.
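
For illustration, one way a sort could report progress as it does real work is
a comparator wrapper along the lines below. This is a sketch only and not the
ReportingComparator from the attached patch: the class name, the
ProgressCallback interface and the "every" throttle are assumptions. The
throttle is there to keep the per-comparison overhead small, and it would still
need benchmarking.

```java
import java.util.Comparator;

/** Hypothetical progress callback; not Hadoop's actual interface. */
interface ProgressCallback {
    void progress();
}

/** Wraps a comparator and reports progress once per 'every' comparisons. Not thread-safe. */
final class ProgressReportingComparator<T> implements Comparator<T> {
    private final Comparator<T> delegate;
    private final ProgressCallback callback;
    private final int every;
    private int sinceLastReport = 0;

    ProgressReportingComparator(Comparator<T> delegate, ProgressCallback callback, int every) {
        if (every < 1) {
            throw new IllegalArgumentException("every must be >= 1");
        }
        this.delegate = delegate;
        this.callback = callback;
        this.every = every;
    }

    @Override
    public int compare(T a, T b) {
        // Progress is reported only while comparisons are actually happening,
        // so a sort that stops comparing also stops reporting.
        if (++sinceLastReport >= every) {
            sinceLastReport = 0;
            callback.progress();
        }
        return delegate.compare(a, b);
    }
}
```

Because reporting is driven by the comparisons themselves, a stuck task goes
quiet and can be killed by the normal timeout.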