[
https://issues.apache.org/jira/browse/MAPREDUCE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997964#comment-13997964
]
Gera Shegalov commented on MAPREDUCE-207:
-----------------------------------------
[[email protected]], thanks for your
[comment|https://issues.apache.org/jira/browse/MAPREDUCE-5887?focusedCommentId=13997431&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13997431]
in MAPREDUCE-5887. Moving the discussion here.
bq. One test to try there is what happens when the blocksize is reported as
very, very small (you can configure this in swiftfs). in the client this will
cause the submitting process to OOM and fail. Presumably the same outcome in
the AM is the simplest to implement -we just need to make sure that YARN
recognises this as a failure and only tries a couple of times
OOMs, like any other AM failure, are treated as an application attempt
failure, so retries are bounded by {{yarn.resourcemanager.am.max-attempts}}.
We've experienced such issues in production, and they are usually only
indirectly related to splits: the job state comprising all map and reduce
attempts is too big for the default MR-AM container size.
Before working on moving split calculation to the MR-AM, I was actually
thinking about auto-tuning {{yarn.app.mapreduce.am.resource.mb}} and the Xmx
opts in JobSubmitter. However, even if the split calculation happens in the
AM, we can come up with an AM-RM RPC along the lines of "start a new attempt
with the new settings".
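To make the auto-tuning idea concrete, here is a rough sketch of what such a heuristic could look like. It is not taken from any patch on this issue: the class name {{AmMemoryTuner}}, the per-task state estimate, the baseline size, the rounding step, and the heap fraction are all assumptions made for illustration, and it uses a plain map instead of the Hadoop {{Configuration}} class so that it stands alone. Setting the heap through {{yarn.app.mapreduce.am.command-opts}} is likewise my reading of "Xmx opts".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: sizes the MR-AM container from the task count.
class AmMemoryTuner {
    static final String AM_RESOURCE_MB = "yarn.app.mapreduce.am.resource.mb";
    static final String AM_COMMAND_OPTS = "yarn.app.mapreduce.am.command-opts";

    // All four constants below are assumed values, not measured ones.
    static final int BASE_MB = 1536;             // assumed baseline container size
    static final long BYTES_PER_TASK = 16 * 1024; // assumed AM state per task
    static final int ROUND_TO_MB = 512;          // assumed allocation increment
    static final int HEAP_PERCENT = 80;          // assumed heap share of container

    // Container size in MB: baseline plus estimated job state, rounded up.
    static int containerMb(long numMaps, long numReduces) {
        long stateBytes = (numMaps + numReduces) * BYTES_PER_TASK;
        long stateMb = (stateBytes + (1L << 20) - 1) >> 20; // ceiling division
        long needed = BASE_MB + stateMb;
        long rounded = ((needed + ROUND_TO_MB - 1) / ROUND_TO_MB) * ROUND_TO_MB;
        return (int) Math.min(rounded, Integer.MAX_VALUE);
    }

    // Heap size in MB, leaving headroom for non-heap JVM memory.
    static int heapMb(int containerMb) {
        return (int) ((long) containerMb * HEAP_PERCENT / 100);
    }

    // Returns the two settings a submitter could apply before launching the AM.
    static Map<String, String> tune(long numMaps, long numReduces) {
        int container = containerMb(numMaps, numReduces);
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put(AM_RESOURCE_MB, Integer.toString(container));
        settings.put(AM_COMMAND_OPTS, "-Xmx" + heapMb(container) + "m");
        return settings;
    }

    public static void main(String[] args) {
        // Example: a job with 1,000,000 maps and 500 reduces.
        tune(1_000_000L, 500L).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The same calculation could in principle feed the "start a new attempt with the new settings" idea, with the AM requesting a larger container once the real number of splits is known. The constants would need to be calibrated against real job state sizes.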
> Computing Input Splits on the MR Cluster
> ----------------------------------------
>
> Key: MAPREDUCE-207
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-207
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: applicationmaster, mrv2
> Reporter: Philip Zeyliger
> Assignee: Arun C Murthy
> Attachments: MAPREDUCE-207.patch, MAPREDUCE-207.v02.patch,
> MAPREDUCE-207.v03.patch
>
>
> Instead of computing the input splits as part of job submission, Hadoop could
> have a separate "job task type" that computes the input splits, therefore
> allowing that computation to happen on the cluster.