[
https://issues.apache.org/jira/browse/MAPREDUCE-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436130#comment-15436130
]
Vinod Kumar Vavilapalli commented on MAPREDUCE-6754:
----------------------------------------------------
bq. Approach A: Include attempt id as part of the JvmId. This is a viable
solution; however, it changes the format of the JvmId. Changing something
that has existed for so long for the sake of an optional feature is hard to justify.
bq. I don't understand the concern about changing the JvmID. It's not really
public and only used within the scope of a single job
Agreed.
[~srikanth.sampath], JvmID was originally added to implement JVM reuse in
Hadoop 1 MapReduce. When we moved to YARN + MR, we lost JVM reuse, and I doubt
we are going to implement it now. So, I'd argue that we can remove JvmID
completely. But if that's too much, then as [~jlowe] says, we can simply change
the format of JvmID, since it is not a public API.
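
To make the format change concrete, here is a minimal sketch of why adding the attempt id to the id keeps tasks apart when container ids are recycled. The class names (SketchJvmId, JvmIdCollisionDemo), the field layout, and the sample values are hypothetical and only for illustration; this is not the actual Hadoop JvmID class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the MR JvmId; NOT the real Hadoop class.
final class SketchJvmId {
    final String jobId;
    final boolean isMap;      // the "m/r?" component
    final int attemptId;      // assumed extra component (Approach A)
    final long containerId;

    SketchJvmId(String jobId, boolean isMap, int attemptId, long containerId) {
        this.jobId = jobId;
        this.isMap = isMap;
        this.attemptId = attemptId;
        this.containerId = containerId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchJvmId)) {
            return false;
        }
        SketchJvmId other = (SketchJvmId) o;
        return jobId.equals(other.jobId) && isMap == other.isMap
                && attemptId == other.attemptId && containerId == other.containerId;
    }

    @Override
    public int hashCode() {
        return Objects.hash(jobId, isMap, attemptId, containerId);
    }
}

class JvmIdCollisionDemo {
    // Registers two tasks from different application attempts that were both
    // given the recycled container id 2, and returns how many distinct keys
    // the map ends up with.
    static int distinctKeys(boolean includeAttemptId) {
        Map<SketchJvmId, String> launched = new HashMap<>();
        // With includeAttemptId == false both ids carry the same placeholder,
        // which models the current three-component format.
        int firstAttempt = includeAttemptId ? 1 : 0;
        int secondAttempt = includeAttemptId ? 2 : 0;
        launched.put(new SketchJvmId("job_1", true, firstAttempt, 2L), "in-flight task from attempt 1");
        launched.put(new SketchJvmId("job_1", true, secondAttempt, 2L), "new task from attempt 2");
        return launched.size();
    }

    public static void main(String[] args) {
        System.out.println("without attempt id: " + distinctKeys(false));
        System.out.println("with attempt id:    " + distinctKeys(true));
    }
}
```

Without the attempt id the second entry silently overwrites the first, which is the collision described in the issue; with it, both tasks stay addressable.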
> Container Ids for a YARN application should be monotonically increasing in
> the scope of the application
> --------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-6754
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6754
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Reporter: Srikanth Sampath
> Assignee: Srikanth Sampath
>
> Currently across application attempts, container Ids are reused. The
> container id is stored in AppSchedulingInfo and it is reinitialized with
> every application attempt. So the containerId scope is limited to the
> application attempt.
> In the MR framework, it is important to note that the containerId is used
> as part of the JvmId. JvmId has 3 components: <jobId, "m/r?",
> containerId>. The JvmId is used in data structures in TaskAttemptListener and
> is passed between the AppMaster and the individual tasks. Within an application
> attempt, no two tasks have the same JvmId.
> This works well currently, since in-flight tasks get killed if the AppMaster
> goes down. However, if we want the AM to be work-preserving, containers
> (and hence containerIds) live across application attempts. If we
> recycle containerIds across attempts, then two independent tasks (one
> in-flight from a previous attempt and a new one in a succeeding attempt)
> can end up with the same JvmId and cause havoc.
> This can be solved in two ways:
> *Approach A*: Include attempt id as part of the JvmId. This is a viable
> solution; however, it changes the format of the JvmId. Changing something
> that has existed for so long for the sake of an optional feature is hard to justify.
> *Approach B*: Keep the container id a monotonically increasing id for
> the life of an application. Container ids are then not reused across
> application attempts, so containers can safely outlive an application
> attempt. This is the preferred approach as it is clean, logical and
> backwards compatible. Nothing changes for existing applications or the
> internal workings.
> *How this is achieved:*
> Currently, we maintain the latest containerId only per application attempt
> and reinitialize it for each new attempt. With this approach, we retrieve the
> latest containerId from the just-failed attempt and initialize the new attempt
> with that value (instead of 0). I can provide the patch if it helps.
> It currently exists in MAPREDUCE-6726
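
The seeding step described under "How this is achieved" could look roughly like the sketch below. It is only an illustration under stated assumptions: SketchContainerIdAllocator and forNewAttempt are hypothetical names, not the AppSchedulingInfo code or the patch in MAPREDUCE-6726.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-attempt allocator, loosely modelling the counter that the
// description says is kept per application attempt. Not actual Hadoop code.
final class SketchContainerIdAllocator {
    private final AtomicLong lastId;

    SketchContainerIdAllocator(long startFrom) {
        this.lastId = new AtomicLong(startFrom);
    }

    // Hands out the next container id for this attempt.
    long next() {
        return lastId.incrementAndGet();
    }

    // Latest id handed out so far (or the seed if none was handed out yet).
    long latest() {
        return lastId.get();
    }

    // Approach B: seed the new attempt from the previous attempt's latest id
    // instead of restarting from 0. A null previous attempt means this is the
    // first attempt of the application.
    static SketchContainerIdAllocator forNewAttempt(SketchContainerIdAllocator previous) {
        long seed = (previous == null) ? 0L : previous.latest();
        return new SketchContainerIdAllocator(seed);
    }
}
```

In this sketch a second attempt created with forNewAttempt(firstAttempt) continues numbering where the first one stopped, so a container surviving from the first attempt never shares an id with a newly allocated one.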
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)