[ https://issues.apache.org/jira/browse/MAPREDUCE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113505#comment-13113505 ]
Vinod Kumar Vavilapalli commented on MAPREDUCE-2952:
----------------------------------------------------
bq. I'll test tmrw on a real cluster.
Please do. The most common case we hit is when the AMContainer gets killed by the
NM for exceeding its memory limits. You can try setting a very high heap size
(>1.5G) for the AMContainer and verify that the AM gets killed by the NM for
exceeding the 2GB vmem limit *and* that the error propagates to the client side.
> Application failure diagnostics are not consumed in a couple of cases
> ---------------------------------------------------------------------
>
> Key: MAPREDUCE-2952
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2952
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2, resourcemanager
> Affects Versions: 0.23.0
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Arun C Murthy
> Priority: Blocker
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-2952.patch, MAPREDUCE-2952.patch
>
>
> When a container crashes, the reason for the failure isn't propagated because of a
> bug in _RMAppAttemptImpl.AMContainerCrashedTransition_, which simply discards
> the diagnostics of the container. Also, RMAppAttemptImpl.diagnostics is never
> consumed.
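
To make the two problems in the description concrete, here is a minimal, self-contained sketch. It is not the actual RMAppAttemptImpl code or the attached patch: the class and method names (DiagnosticsSketch, ContainerStatus, AppAttempt, onAmContainerCrashedDiscarding, onAmContainerCrashedKeeping) are hypothetical stand-ins, and the exit code and message are made-up illustrative values. It only contrasts a crash handler that drops the container's diagnostics with one that keeps them where the reporting side can read them.

```java
// Hypothetical stand-ins for illustration only; these are NOT the real YARN classes.
public class DiagnosticsSketch {

    /** Minimal model of a finished container's status. */
    static final class ContainerStatus {
        final int exitCode;
        final String diagnostics;

        ContainerStatus(int exitCode, String diagnostics) {
            this.exitCode = exitCode;
            this.diagnostics = diagnostics;
        }
    }

    /** Minimal model of an application attempt that accumulates diagnostics. */
    static final class AppAttempt {
        private final StringBuilder diagnostics = new StringBuilder();

        // Problematic shape: the handler reacts to the crash but never
        // looks at status.diagnostics, so the reason is lost.
        void onAmContainerCrashedDiscarding(ContainerStatus status) {
            // (state change would happen here; diagnostics are ignored)
        }

        // Intended shape: record the container's diagnostics on the attempt.
        void onAmContainerCrashedKeeping(ContainerStatus status) {
            diagnostics.append("AM container exited with code ")
                       .append(status.exitCode)
                       .append(": ")
                       .append(status.diagnostics)
                       .append('\n');
        }

        // Recording is not enough: whatever builds the client-facing report
        // has to actually read this value.
        String getDiagnostics() {
            return diagnostics.toString();
        }
    }

    public static void main(String[] args) {
        // Made-up exit code and message, assumed for the example.
        ContainerStatus killed =
            new ContainerStatus(1, "Container killed for exceeding virtual memory limit");

        AppAttempt lossy = new AppAttempt();
        lossy.onAmContainerCrashedDiscarding(killed);
        System.out.println("discarding: [" + lossy.getDiagnostics() + "]");

        AppAttempt fixed = new AppAttempt();
        fixed.onAmContainerCrashedKeeping(killed);
        System.out.println("keeping:    [" + fixed.getDiagnostics().trim() + "]");
    }
}
```

In the sketch, the client only ever sees what getDiagnostics() returns, so both halves matter: the crash handler has to store the message, and the reporting path has to read it.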