[ 
https://issues.apache.org/jira/browse/MAPREDUCE-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-913:
----------------------------------------------

    Attachment: patch-913.txt

The patch does the following:
1. Changed the reportTaskFinished code so that the slot is always released, by 
calling releaseSlot in a finally block.
2. Reverted the changes that threw an exception when the arguments to the 
debug script could not be constructed, since the code already initializes them 
to an empty String.
3. Modified the test case to use the new API.
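
For illustration, item 1 can be sketched roughly as below. This is a minimal 
stand-alone sketch, not the code in patch-913.txt: the SlotReleaseSketch class, 
its counter and the Runnable parameter are hypothetical stand-ins, and only the 
names reportTaskFinished and releaseSlot come from the description above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the TaskTracker's slot accounting; not Hadoop code.
class SlotReleaseSketch {
    private final AtomicInteger freeSlots;

    SlotReleaseSketch(int totalSlots) {
        this.freeSlots = new AtomicInteger(totalSlots);
    }

    int freeSlots() {
        return freeSlots.get();
    }

    // Marks one slot as occupied by a task.
    void acquireSlot() {
        freeSlots.decrementAndGet();
    }

    // Returns one slot to the free pool.
    void releaseSlot() {
        freeSlots.incrementAndGet();
    }

    // Runs the post-task bookkeeping (assumed here to be any Runnable).
    // The finally block releases the slot even if the bookkeeping throws,
    // for example with a NullPointerException.
    void reportTaskFinished(Runnable bookkeeping) {
        try {
            bookkeeping.run();
        } finally {
            releaseSlot();
        }
    }
}
```

With this shape the exception still propagates to the caller, but the slot 
count is restored first, so a failing task cannot leave a slot held.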

bq.  In test case can we verify the correct number of the map slot is actually 
reported back to JobTracker after the failing job completes, this would test 
the actual slot management.
4. Added asserts for slot management. Verified that the test passes with the 
patch and fails without it.

bq. Can we check if the workDir is non-null in the run-debug script and throw 
an exception if the same is null? Would prevent launch of task-controller code.
If workDir is null or does not exist, the current code already throws an 
IOException.
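
A guard of the kind described could look roughly like the following. This is 
only a sketch under assumptions: WorkDirCheckSketch, ensureWorkDir and the 
error message are hypothetical, and the actual check in the TaskTracker code 
may be structured differently.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper; illustrates the null / missing workDir check only.
class WorkDirCheckSketch {
    // Fails with an IOException before any debug script would be launched
    // if the working directory is null or does not exist on disk.
    static void ensureWorkDir(File workDir) throws IOException {
        if (workDir == null || !workDir.exists()) {
            throw new IOException("Working directory is not available: " + workDir);
        }
    }
}
```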

bq. Wouldn't it be much better that we add a check to figure out if the taskJVM 
was launched or not and then run debug script accordingly.
This may need more discussion, since it changes the feature so that the debug 
script is launched only when the task JVM has been launched properly.


> TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks 
> and hung TaskTracker
> ------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-913
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-913
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.1
>            Reporter: Vinod K V
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: mapreduce-913-1.patch, MAPREDUCE-913-20091119.1.txt, 
> MAPREDUCE-913-20091119.2.txt, MAPREDUCE-913-20091120.1.txt, patch-913.txt
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
