[ 
https://issues.apache.org/jira/browse/HADOOP-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500447
 ] 

Doug Cutting commented on HADOOP-1431:
--------------------------------------

Sigh. I wish this just started a new thread around each call to sortAndSpill, 
as I suggested above, something like:

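// Declare the thread outside the try so that it is in scope in the finally.
Thread progress = createProgressThread(umbilical);
try {
   sortAndSpill();
} finally {
   // Always stop the progress thread, even if sortAndSpill() throws.
   progress.interrupt();
}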

As it stands, the call to stop the thread is in a finally, but after other 
things that could throw exceptions, so there's no guarantee that the thread 
will actually exit.  And the calls to pause the thread are not in a finally at 
all, so, if there's an exception in sorting, the thread will keep reporting 
progress.  Reusing a thread seems like a premature optimization that opens up 
lots of possible error modes that we don't need.  I think we should instead 
simply narrow the scope of the prior logic.  Threads are plenty cheap for this, 
and I don't see that the optimization is worth either the risks it adds or the 
increased code to maintain.
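
To make the scoping argument concrete, here is a minimal, self-contained 
sketch of the thread-per-call pattern.  It is an illustration only, not the 
patch: the Reporter interface is a hypothetical stand-in for the umbilical, 
the interval parameter and the runWithProgress helper are assumptions, and 
the join() is an addition of mine so that the caller can rely on the thread 
being gone on return.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ScopedProgressThread {

    /** Hypothetical stand-in for the task umbilical; not the real Hadoop interface. */
    interface Reporter {
        void progress();
    }

    /** Starts a daemon thread that calls progress() every intervalMs until interrupted. */
    static Thread createProgressThread(Reporter reporter, long intervalMs) {
        Thread t = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    reporter.progress();
                    Thread.sleep(intervalMs);
                }
            } catch (InterruptedException e) {
                // Interrupted while sleeping: fall through and let the thread exit.
            }
        }, "sort-progress");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Runs work with a progress thread that lives only for the duration of work. */
    static void runWithProgress(Reporter reporter, long intervalMs, Runnable work)
            throws InterruptedException {
        Thread progress = createProgressThread(reporter, intervalMs);
        try {
            work.run();
        } finally {
            // Reached on both normal return and exception, so the thread cannot leak.
            progress.interrupt();
            progress.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger calls = new AtomicInteger();
        try {
            // Simulate a sort that fails part-way through.
            runWithProgress(calls::incrementAndGet, 10, () -> {
                throw new IllegalStateException("sort failed");
            });
        } catch (IllegalStateException expected) {
            System.out.println("sort failed; progress thread has exited");
        }
    }
}
```

Because the thread is created immediately before the try and interrupted in 
the finally, there is no window in which an exception can skip the stop call, 
which is the property the reused-thread version does not have.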


> Map tasks can't timeout for failing to call progress
> ----------------------------------------------------
>
>                 Key: HADOOP-1431
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1431
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.13.0
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.13.0
>
>         Attachments: HADOOP-1431_1_20070525.patch, 
> HADOOP-1431_2_20070530.patch
>
>
> Currently the map task runner creates a thread that calls progress every 
> second to keep the system from killing the map if the sort takes too long. 
> This is the wrong approach, because it will cause stuck tasks to not be 
> killed. The right solution is to have the sort call progress as it actually 
> makes progress. This is part of what is going on in HADOOP-1374. A map gets 
> stuck at 100% progress, but not done.
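
One way to read "have the sort call progress as it actually makes progress" 
is to report from inside the comparison loop, so that a stuck sort stops 
reporting and can time out.  The sketch below is an assumption about how that 
could look, not what HADOOP-1374 or the attached patches do; the Reporter 
interface, the reporting() helper, and the every parameter are all 
hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;

class ProgressReportingSort {

    /** Hypothetical callback; stands in for whatever the task uses to report liveness. */
    interface Reporter {
        void progress();
    }

    /** Wraps a comparator so that progress() fires once per `every` comparisons. */
    static <T> Comparator<T> reporting(Comparator<T> inner, Reporter reporter, int every) {
        return new Comparator<T>() {
            private long comparisons = 0;

            @Override
            public int compare(T a, T b) {
                // Progress is reported only while comparisons are really happening.
                if (++comparisons % every == 0) {
                    reporter.progress();
                }
                return inner.compare(a, b);
            }
        };
    }

    public static void main(String[] args) {
        Integer[] data = {5, 3, 9, 1, 7, 2, 8};
        int[] calls = {0};
        Arrays.sort(data, reporting(Comparator.<Integer>naturalOrder(), () -> calls[0]++, 2));
        System.out.println(Arrays.toString(data) + " progress calls: " + calls[0]);
    }
}
```

With this shape no background thread is needed at all: if the sort hangs, the 
comparator is never invoked, progress() is never called, and the normal task 
timeout applies.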

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
