[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated MAPREDUCE-4937:
----------------------------------

    Attachment: MAPREDUCE-4937.MRAMHandlOversizeSplits.txt

Thank you for the comments, Jason.

bq. ◾Instead of duplicating the super.serviceStart() and having a mid-method 
return, we could do something like this:
I rearranged the code as suggested, with one adjustment: initFailed is set to 
true if the job's internal state is anything other than INITED, rather than 
only if it is NEW.
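
For readers without the patch at hand, here is a rough sketch of the control 
flow described above. It is not the code from the attachment: SketchJob, 
SketchAppMaster, SketchJobState and the FAILED state are hypothetical stand-ins, 
and startServices() merely stands in for the single super.serviceStart() call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these are Hadoop classes.
enum SketchJobState { NEW, INITED, FAILED }

class SketchJob {
    private SketchJobState state = SketchJobState.NEW;
    private final boolean splitsTooLarge;

    SketchJob(boolean splitsTooLarge) {
        this.splitsTooLarge = splitsTooLarge;
    }

    // Assumption: a failed init may leave the job in a state other than NEW.
    void handleInit() {
        state = splitsTooLarge ? SketchJobState.FAILED : SketchJobState.INITED;
    }

    SketchJobState getInternalState() {
        return state;
    }
}

class SketchAppMaster {
    final List<String> events = new ArrayList<>();
    private final SketchJob job;

    SketchAppMaster(SketchJob job) {
        this.job = job;
    }

    void serviceStart() {
        events.add("JOB_INIT");
        job.handleInit();
        // Any state other than INITED counts as a failed init, not only NEW.
        boolean initFailed = job.getInternalState() != SketchJobState.INITED;
        // One call site for the shared startup work, with no mid-method return.
        startServices();
        events.add(initFailed ? "JOB_INIT_FAILED" : "JOB_START");
    }

    private void startServices() {
        events.add("SERVICES_STARTED");
    }
}

class InitFailedSketch {
    public static void main(String[] args) {
        SketchAppMaster am = new SketchAppMaster(new SketchJob(true));
        am.serviceStart();
        System.out.println(am.events);
    }
}
```

Checking for "not INITED" means a job that leaves NEW but still fails to 
initialize is routed to the failure path as well.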

bq. ◾Nit: JOB_INIT_FAILED should be listed by the other events produced by 
MRAppMaster (i.e.: JOB_INIT, JOB_START) so like the other events it's 
documented where the event originates
Moved JOB_INIT_FAILED so that it is listed with the other events produced by 
MRAppMaster.

bq. ◾Nit: would be nice to have some better indentation on the wrapped lines in 
the test case, since those lines have smaller indents than the method 
declaration
Fixed the indentation.

> MR AM handles an oversized split metainfo file poorly
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-4937
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4937
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.2-alpha, 0.23.5
>            Reporter: Jason Lowe
>            Assignee: Eric Payne
>         Attachments: MAPREDUCE-4937.MRAMHandlOversizeSplits.txt, 
> MAPREDUCE-4937.MRAMHandlOversizeSplits.txt, 
> MAPREDUCE-4937.MRAMHandlOversizeSplits.txt
>
>
> When a job runs with a split metainfo file that's larger than the AM has been 
> configured to handle, the AM just crashes.  This leaves the user with a 
> less-than-ideal debug session since there are no useful diagnostic messages 
> sent to the client for this failure.  In addition it crashes before 
> registering/unregistering with the RM and crashes without generating history, 
> so the proxy URL is not very useful and there's no archived configuration to 
> check to see what setting the AM was using when it encountered the error.
> The AM should handle this error case more gracefully and treat the failure as 
> it does any other failed job, with a proper unregistration from the RM and 
> with history.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
