[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3006:
-----------------------------------------------

    Attachment: MAPREDUCE-3006-20110917.txt

Updated patch.

bq. [..] the unused conf variable (findbugs) 
Done.

bq. ContainerLauncherRouter needs to implement stop to stop the actual 
ContainerLauncher.
Good catch! Done.
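
To illustrate the point about the router, here is a minimal sketch of a wrapper service that forwards stop() to the launcher it wraps. All names below ({{SimpleService}}, {{RecordingLauncher}}, {{LauncherRouter}}) are hypothetical stand-ins for illustration; this is not the code in the patch and not Hadoop's actual service API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal lifecycle interface (an assumption, not Hadoop's Service API).
interface SimpleService {
    void start();
    void stop();
}

// Stand-in for the real launcher; it records the lifecycle calls it receives.
class RecordingLauncher implements SimpleService {
    final List<String> calls = new ArrayList<>();

    @Override
    public void start() {
        calls.add("start");
    }

    @Override
    public void stop() {
        calls.add("stop");
    }
}

// Stand-in for a router that wraps the real launcher.
class LauncherRouter implements SimpleService {
    private final SimpleService delegate;

    LauncherRouter(SimpleService delegate) {
        this.delegate = delegate;
    }

    @Override
    public void start() {
        delegate.start();
    }

    // Without this forwarding, the wrapped launcher would never be stopped
    // when the router itself is stopped.
    @Override
    public void stop() {
        delegate.stop();
    }
}
```

If the router did not override stop(), stopping the router would leave whatever resources the wrapped launcher holds still running.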

bq. Job doesn't really follow the lifecycle of the services, but would it make 
sense to create the job and send the JOB_INIT event in MRAM.init() after a 
super.init()? Job creation uses completedTasksFromPreviousRun, which is only 
populated after the RecoveryService is initialized.
I had already noticed the {{completedTasksFromPreviousRun}} issue earlier. It 
works today because only the reference is stored: the reference is backed by 
the underlying map, and it is not read until the job actually starts.
In any case, having to create the job and send events after _super.init()_ 
clearly tells me that this belongs in the _start()_ life-cycle stage, so I 
moved it there.
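
A rough sketch of both points, under stated assumptions: a stored map reference sees entries added later, and creating the job in start() guarantees that recovery has already been initialized. The classes {{Recovery}}, {{SketchJob}} and {{SketchAppMaster}} are hypothetical and only mirror the shape of the discussion; they are not the classes touched by the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the recovery component: the map is filled only in init().
class Recovery {
    private final Map<String, String> completed = new HashMap<>();

    Map<String, String> getCompletedTasks() {
        return completed;
    }

    void init() {
        completed.put("task_1", "SUCCEEDED");
    }
}

// Hypothetical job that keeps a reference to the map rather than a copy.
class SketchJob {
    private final Map<String, String> completedFromPreviousRun;

    SketchJob(Map<String, String> completedFromPreviousRun) {
        this.completedFromPreviousRun = completedFromPreviousRun;
    }

    int recoveredCount() {
        return completedFromPreviousRun.size();
    }
}

// Hypothetical application master: services are initialized in init(),
// and the job is created in start(), after every init() has run.
class SketchAppMaster {
    final Recovery recovery = new Recovery();
    SketchJob job;

    void init() {
        recovery.init();
    }

    void start() {
        job = new SketchJob(recovery.getCompletedTasks());
    }
}
```

Creating the job before recovery is initialized still works in this sketch, but only because the job holds a live reference and does not read it early. Creating it in start() removes that ordering dependency.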

> MapReduce AM exits prematurely before completely writing and closing the 
> JobHistory file
> ----------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3006
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3006
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster, mrv2
>    Affects Versions: 0.23.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Vinod Kumar Vavilapalli
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-3006-20110915.txt, 
> MAPREDUCE-3006-20110916.txt, MAPREDUCE-3006-20110917.txt
>
>
> [~Karams] was executing a sleep job with 100,000 tasks on a 350 node cluster 
> to test MR AM's scalability and ran into this. The job ran successfully but 
> the history was not available.
> I debugged it and found that the job finishes prematurely, before the 
> JobHistory is written. In most cases we don't see this bug because the AM 
> sleeps for 5 seconds towards the end.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
