[ 
http://issues.apache.org/jira/browse/HADOOP-133?page=comments#action_12374421 ] 

Doug Cutting commented on HADOOP-133:
-------------------------------------

We can't always rely on cleanup/finally stuff to run.  JVMs can exit 
unexpectedly.  We hope it doesn't happen often, but we must be able to handle 
that situation.  If we need to, e.g., clean up temp files, we do that on 
startup.
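
To illustrate the startup-cleanup idea, here is a minimal sketch. The class name StartupCleanup, the method removeStaleTempFiles, and the file-name prefix convention are all hypothetical; this is not Hadoop's actual code, only an example of clearing out leftovers from a previous run that died before any finally block could execute.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: remove temp files left behind by a JVM that exited unexpectedly. */
class StartupCleanup {
    /**
     * Deletes regular files in dir whose names start with prefix.
     * Intended to be called once at startup, before any new temp files are created.
     * Returns the number of files deleted.
     */
    static int removeStaleTempFiles(Path dir, String prefix) throws IOException {
        int removed = 0;
        // The filter keeps only entries whose file name starts with the given prefix.
        try (DirectoryStream<Path> stale = Files.newDirectoryStream(
                dir, p -> p.getFileName().toString().startsWith(prefix))) {
            for (Path p : stale) {
                if (Files.isRegularFile(p) && Files.deleteIfExists(p)) {
                    removed++;
                }
            }
        }
        return removed;
    }
}
```

Because the cleanup runs at startup rather than at shutdown, it does not matter how the previous process ended.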

The reason this was added was to handle the case where the tasktracker has 
exited and the child is somehow hung.  We must not leave stray, hung JVMs 
around.  Thread.interrupt() is not reliable enough.  When a thread is hung, it 
will not receive an interrupt.  I've seen this frequently when fetching, where 
socket read() requests hang indefinitely, despite the socket having a short 
read timeout.

So I'd be happy to have this first try to exit more gracefully, but, after a 
time, it should still call exit().  The child processes do not have a pid file.  
Once their parent has died, nothing tracks them, so they must reliably exit 
fairly quickly when their parent dies.
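
A rough sketch of that "graceful first, then forced" sequence follows. The class ParentWatchdog, its parentLost method, and the injectable hardExit action are hypothetical names chosen for illustration; this is not the actual TaskTracker.Child code. It assumes the ping thread calls parentLost() once the parent stops responding.

```java
/** Hypothetical sketch: interrupt the worker, wait a bounded time, then force an exit. */
class ParentWatchdog {
    private final Thread worker;      // thread running the map or reduce task
    private final long graceMillis;   // how long to wait for a clean stop
    private final Runnable hardExit;  // last resort, e.g. () -> Runtime.getRuntime().halt(1)

    ParentWatchdog(Thread worker, long graceMillis, Runnable hardExit) {
        if (graceMillis <= 0) {
            // join(0) would wait forever, which defeats the purpose of a deadline.
            throw new IllegalArgumentException("graceMillis must be positive");
        }
        this.worker = worker;
        this.graceMillis = graceMillis;
        this.hardExit = hardExit;
    }

    /**
     * Call when the parent no longer answers pings.
     * Returns true if the worker stopped within the grace period,
     * false if it was still alive and hardExit had to be invoked.
     */
    boolean parentLost() {
        worker.interrupt();               // step 1: ask the worker to stop
        try {
            worker.join(graceMillis);     // step 2: bounded wait for it to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the flag, then fall through
        }
        if (worker.isAlive()) {
            hardExit.run();               // step 3: worker is hung, force the exit
            return false;
        }
        return true;
    }
}
```

Passing the exit action in as a Runnable is only a convenience for trying the sketch out without killing the JVM; in a real child process it would be an unconditional exit, so a hung worker cannot keep the JVM alive after the parent is gone.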

> the TaskTracker.Child.ping thread calls exit
> --------------------------------------------
>
>          Key: HADOOP-133
>          URL: http://issues.apache.org/jira/browse/HADOOP-133
>      Project: Hadoop
>         Type: Bug

>   Components: mapred
>     Versions: 0.1.1
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley

>
> The TaskTracker.Child.startPinging thread calls exit if the TaskTracker 
> doesn't respond. Calling exit in a multi-threaded program is really 
> problematic. In particular, it prevents cleanup/finally clauses from running. 
> We need to move to a model where it uses Thread.interrupt(), which means we 
> need to check the interrupt flag in the map loop and reduce loop and 
> stop masking the InterruptedExceptions.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
