[ https://issues.apache.org/jira/browse/HADOOP-3546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-3546:
--------------------------------
Status: Open (was: Patch Available)
There is a race condition in the cleanup thread that can prevent the thread
from ever exiting: if the main thread sends the interrupt just as the cleanup
thread is about to call tasksToCleanup.take(), the interrupt is lost and the
cleanup thread stays blocked in take() forever. Although we could handle this
by introducing additional synchronization, I'd suggest that we remove the join
on these threads and instead run them as daemons. I am nervous about adding a
lot of synchronization code just to handle the case where files are left over
on tasktracker exit. Since the tasktracker does the cleanup at startup anyway,
we should be OK.
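
To make the suggestion concrete, here is a rough sketch of the daemon approach.
It is not the patch and not the actual TaskTracker code: the class
DaemonCleanupSketch, the method startCleanupThread, and the "cleaned" list are
made up for illustration, and only the queue name tasksToCleanup comes from
the comment above. The point is that the thread is marked as a daemon before it
starts and the caller never joins it, so a thread blocked in take() cannot
hold up shutdown or re-initialization.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

public class DaemonCleanupSketch {

    // Hypothetical stand-in for the cleanup thread; the real cleanup work
    // (deleting task files) is replaced by recording the task name.
    static Thread startCleanupThread(BlockingQueue<String> tasksToCleanup,
                                     List<String> cleaned) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    // Blocks until a task is queued.
                    String task = tasksToCleanup.take();
                    cleaned.add(task); // placeholder for the actual cleanup
                }
            } catch (InterruptedException e) {
                // Restore the interrupt status and let the thread end.
                Thread.currentThread().interrupt();
            }
        }, "taskCleanup");
        // Daemon threads do not keep the JVM alive, so no join is needed.
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> tasksToCleanup = new LinkedBlockingQueue<>();
        List<String> cleaned = new CopyOnWriteArrayList<>();
        Thread t = startCleanupThread(tasksToCleanup, cleaned);
        tasksToCleanup.put("task_0001"); // illustrative task name
        // Deliberately no t.join(): main returns and the JVM exits even if
        // the cleanup thread is still blocked in take().
        System.out.println("daemon=" + t.isDaemon());
    }
}
```

The trade-off is the one already noted: a daemon thread can be cut off in the
middle of a cleanup when the process exits, which is acceptable here only
because the tasktracker cleans up again at startup.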
> TaskTracker re-initialization gets stuck in cleaning up
> -------------------------------------------------------
>
> Key: HADOOP-3546
> URL: https://issues.apache.org/jira/browse/HADOOP-3546
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.18.0
> Reporter: Amareshwari Sriramadasu
> Assignee: Amareshwari Sriramadasu
> Priority: Blocker
> Fix For: 0.18.0
>
> Attachments: patch-3546.txt
>
>
> If the TaskTracker receives a reinit action, it gets stuck joining the task
> cleanup thread.