[jira] [Updated] (MAPREDUCE-6754) Container Ids for an yarn application should be monotonically increasing in the scope of the application
[ https://issues.apache.org/jira/browse/MAPREDUCE-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

D M Murali Krishna Reddy updated MAPREDUCE-6754:
    Status: Patch Available  (was: Open)

> Container Ids for an yarn application should be monotonically increasing in
> the scope of the application
>
> Key: MAPREDUCE-6754
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6754
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Reporter: Srikanth Sampath
> Assignee: D M Murali Krishna Reddy
> Priority: Major
> Attachments: MAPREDUCE-6754-001.patch
>
> Currently, container ids are reused across application attempts. The
> container id is stored in AppSchedulingInfo and is reinitialized with every
> application attempt, so the containerId's scope is limited to the
> application attempt.
> In the MR framework, it is important to note that the containerId is used as
> one of the three components of the JvmId. The JvmId is used in data
> structures in TaskAttemptListener and is passed between the AppMaster and
> the individual tasks. For an application attempt, no two tasks have the
> same JvmId.
> This works well currently, since in-flight tasks get killed if the AppMaster
> goes down. However, if we want to enable a work-preserving nature for the
> AM, containers (and hence containerIds) live across application attempts.
> If we recycle containerIds across attempts, then two independent tasks (one
> in-flight from a previous attempt and another new in a succeeding attempt)
> can have the same JvmId and cause havoc.
> This can be solved in two ways:
> *Approach A*: Include the attempt id as part of the JvmId. This is a viable
> solution; however, it changes the format of the JvmId, and changing
> something that has existed so long for an optional feature is not
> persuasive.
> *Approach B*: Keep the container id monotonically increasing for the life
> of an application, so container ids are not reused across application
> attempts and containers are able to outlive an application attempt. This is
> the preferred approach as it is clean, logical, and backwards compatible.
> Nothing changes for existing applications or the internal workings.
> *How this is achieved:*
> Currently, we maintain the latest containerId only per application attempt
> and reinitialize it for new attempts. With this approach, we retrieve the
> latest containerId from the just-failed attempt and initialize the new
> attempt with that latest containerId (instead of 0). I can provide the
> patch if it helps. It currently exists in MAPREDUCE-6726.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
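The fix described in Approach B (seeding each new attempt's containerId counter with the latest id from the just-failed attempt, instead of 0) can be sketched as follows. This is a minimal illustration, not the actual AppSchedulingInfo code; the class and method names are hypothetical:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simplification of the per-attempt containerId counter
// kept in AppSchedulingInfo.
class ContainerIdCounter {

    private final AtomicInteger latest;

    // Approach B: a new attempt is seeded with the previous attempt's
    // latest containerId rather than 0, so ids stay monotonically
    // increasing for the life of the application.
    ContainerIdCounter(int latestFromPreviousAttempt) {
        this.latest = new AtomicInteger(latestFromPreviousAttempt);
    }

    int nextContainerId() {
        return latest.incrementAndGet();
    }
}

public class MonotonicIdDemo {
    public static void main(String[] args) {
        // Attempt 1 allocates three containers: ids 1, 2, 3.
        ContainerIdCounter attempt1 = new ContainerIdCounter(0);
        attempt1.nextContainerId();
        attempt1.nextContainerId();
        int lastOfAttempt1 = attempt1.nextContainerId();

        // Attempt 2 is seeded with attempt 1's latest id, so its first
        // container gets id 4: no reuse, hence no JvmId collision with
        // an in-flight task from attempt 1.
        ContainerIdCounter attempt2 = new ContainerIdCounter(lastOfAttempt1);
        int firstOfAttempt2 = attempt2.nextContainerId();

        // prints "attempt1 last: 3, attempt2 first: 4"
        System.out.println("attempt1 last: " + lastOfAttempt1
                + ", attempt2 first: " + firstOfAttempt2);
    }
}
```

With the pre-patch behavior (seeding every attempt with 0), attempt 2's first container would also get id 1, which is exactly the collision the issue describes.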
[jira] [Updated] (MAPREDUCE-6754) Container Ids for an yarn application should be monotonically increasing in the scope of the application
[ https://issues.apache.org/jira/browse/MAPREDUCE-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

D M Murali Krishna Reddy updated MAPREDUCE-6754:
    Attachment: MAPREDUCE-6754-001.patch

> Key: MAPREDUCE-6754
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6754
> Reporter: Srikanth Sampath
> Assignee: D M Murali Krishna Reddy
> Attachments: MAPREDUCE-6754-001.patch
[jira] [Updated] (MAPREDUCE-6754) Container Ids for an yarn application should be monotonically increasing in the scope of the application
[ https://issues.apache.org/jira/browse/MAPREDUCE-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Srikanth Sampath updated MAPREDUCE-6754:
    Summary: Container Ids for an yarn application should be monotonically increasing in the scope of the application  (was: Container Ids for a yarn application should be monotonically increasing in the scope of the application)

> Key: MAPREDUCE-6754
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6754
> Reporter: Srikanth Sampath
> Assignee: Srikanth Sampath