Race condition while launching task cleanup attempt.
----------------------------------------------------

                 Key: MAPREDUCE-1475
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1475
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: tasktracker
    Affects Versions: 0.20.1
            Reporter: Amareshwari Sriramadasu


We found a race condition when launching a task cleanup attempt on a 
TaskTracker. The race leaves a slot permanently occupied.

The scenario is as follows:
The main attempt was killed by the TaskTracker because it was a speculative 
attempt, and a cleanup attempt was launched on the same tracker. The cleanup 
attempt occupied a slot and was about to start, but a done() RPC from the 
earlier attempt was still pending in the RPC queue. Before the cleanup attempt 
could be launched, the TaskTracker processed that RPC and set the state of the 
cleanup attempt to KILLED. The launcher then skipped the cleanup attempt because 
it was already KILLED, and the done() RPC itself failed with a 
NullPointerException because of the incorrect state. In summary, the slot was 
occupied by a cleanup attempt that could not be launched, and the slot was 
never released.
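
The interleaving above can be replayed in a small single-threaded sketch. This 
is not TaskTracker code: the class, method, and field names below are 
hypothetical, the NullPointerException from done() is not modeled, and the 
"release on skip" flag is only one possible remedy, not the fix adopted for 
this issue.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Single-threaded replay of the reported interleaving; all names are hypothetical. */
final class SlotLeakSketch {
    enum State { UNASSIGNED, RUNNING, KILLED }

    /** Minimal stand-in for a tracker with a fixed number of slots. */
    static final class Tracker {
        int freeSlots;
        State cleanupState = State.UNASSIGNED;
        final Queue<String> rpcQueue = new ArrayDeque<>();

        Tracker(int slots) { this.freeSlots = slots; }

        /** The cleanup attempt takes a slot before it is actually launched. */
        void reserveSlotForCleanup() { freeSlots--; }

        /** A stale done() from the earlier attempt marks the cleanup attempt KILLED. */
        void drainRpcQueue() {
            while (!rpcQueue.isEmpty()) {
                if ("done".equals(rpcQueue.poll())) {
                    cleanupState = State.KILLED;
                }
            }
        }

        /** The launcher skips attempts that are already KILLED. */
        boolean launchCleanup(boolean releaseSlotOnSkip) {
            if (cleanupState == State.KILLED) {
                if (releaseSlotOnSkip) {
                    freeSlots++; // assumed remedy: give the slot back when skipping
                }
                return false;
            }
            cleanupState = State.RUNNING;
            return true;
        }
    }

    /** Replays the sequence on a one-slot tracker and returns the free slot count. */
    static int freeSlotsAfterRace(boolean releaseSlotOnSkip) {
        Tracker t = new Tracker(1);
        t.rpcQueue.add("done");          // done() from the earlier attempt is pending
        t.reserveSlotForCleanup();       // cleanup attempt occupies the slot
        t.drainRpcQueue();               // stale RPC is processed before the launch
        t.launchCleanup(releaseSlotOnSkip);
        return t.freeSlots;
    }

    public static void main(String[] args) {
        System.out.println("free slots, as reported: " + freeSlotsAfterRace(false));
        System.out.println("free slots, release on skip: " + freeSlotsAfterRace(true));
    }
}
```

With releaseSlotOnSkip set to false, the one-slot tracker ends with no free 
slots even though nothing is running, which mirrors the leak described above.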


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
