[jira] [Commented] (YARN-4750) App metrics may not be correct when an app is recovered
[ https://issues.apache.org/jira/browse/YARN-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175105#comment-15175105 ]

Srikanth Sampath commented on YARN-4750:

Agree, [~jianhe], that it would be expensive to update periodically. However, it would be useful to indicate that the metrics are compromised. One option is to set the value to a special value (say, a negative number) to indicate, one time, that it is compromised. Just carrying on silently can be misleading.

> App metrics may not be correct when an app is recovered
> -------------------------------------------------------
>
> Key: YARN-4750
> URL: https://issues.apache.org/jira/browse/YARN-4750
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Lavkesh Lahngir
> Assignee: Lavkesh Lahngir
>
> App metrics (or rather, app attempt metrics) like Vcore-seconds and MB-seconds are
> saved in the state store when there is an attempt state transition. Values
> for running attempts are held in memory and are not saved when there is an
> RM restart/failover. For a recovered app, the metrics values are reset, so in
> that case they will be incomplete.
> Was this intentional, or have we not found a correct way to fix it?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
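The sentinel idea suggested in the comment above could look roughly like the following. This is only a minimal sketch under stated assumptions: `AttemptUsage`, `COMPROMISED`, and `aggregate` are hypothetical names made up for illustration and are not part of YARN, and the choice of -1 as the marker is likewise an assumption.

```java
import java.util.List;

/** Hypothetical holder for per-attempt resource usage; not an actual YARN class. */
final class AttemptUsage {
    /** Assumed sentinel: a negative value means the usage was lost and cannot be trusted. */
    static final long COMPROMISED = -1L;

    private long memorySeconds;
    private long vcoreSeconds;

    AttemptUsage(long memorySeconds, long vcoreSeconds) {
        this.memorySeconds = memorySeconds;
        this.vcoreSeconds = vcoreSeconds;
    }

    /** Used on recovery when a running attempt's in-memory usage was not saved. */
    static AttemptUsage compromised() {
        return new AttemptUsage(COMPROMISED, COMPROMISED);
    }

    boolean isCompromised() {
        return memorySeconds < 0 || vcoreSeconds < 0;
    }

    /** Accumulates usage; a compromised value is set once and then left untouched. */
    void add(long mbSeconds, long vcSeconds) {
        if (isCompromised()) {
            return;
        }
        memorySeconds += mbSeconds;
        vcoreSeconds += vcSeconds;
    }

    long getMemorySeconds() { return memorySeconds; }

    long getVcoreSeconds() { return vcoreSeconds; }

    /** App-level total: compromised as soon as any single attempt is compromised. */
    static AttemptUsage aggregate(List<AttemptUsage> attempts) {
        AttemptUsage total = new AttemptUsage(0, 0);
        for (AttemptUsage a : attempts) {
            if (a.isCompromised()) {
                return compromised();
            }
            total.add(a.getMemorySeconds(), a.getVcoreSeconds());
        }
        return total;
    }
}
```

With this shape, a consumer of the metrics checks `isCompromised()` before reporting, rather than silently showing a partial sum.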
[jira] [Commented] (YARN-2571) RM to support YARN registry
[ https://issues.apache.org/jira/browse/YARN-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726634#comment-14726634 ]

Srikanth Sampath commented on YARN-2571:

Thanks [~steve_l]. I will ping you directly.

> RM to support YARN registry
>
> Key: YARN-2571
> URL: https://issues.apache.org/jira/browse/YARN-2571
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.6.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Labels: BB2015-05-TBR
> Attachments: YARN-2571-001.patch, YARN-2571-002.patch,
> YARN-2571-003.patch, YARN-2571-005.patch, YARN-2571-007.patch,
> YARN-2571-008.patch, YARN-2571-009.patch, YARN-2571-010.patch
>
> The RM needs to (optionally) integrate with the YARN registry:
> # startup: create the /services and /users paths with system ACLs (yarn, hdfs
> principals)
> # app-launch: create the user directory /users/$username with the relevant
> permissions (CRD) for them to create subnodes.
> # attempt, container, app completion: remove service records with the
> matching persistence and ID
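The three integration points listed in the issue description above (startup, app-launch, completion) can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the patch's implementation: `RegistrySketch`, its methods, and the `Persistence` enum are hypothetical names invented for illustration, and ACLs and permissions are deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for the registry; not the YARN registry API. */
final class RegistrySketch {
    /** Assumed persistence scopes, mirroring "attempt, container, app" in the issue. */
    enum Persistence { APPLICATION, APPLICATION_ATTEMPT, CONTAINER }

    /** A service record tagged with the scope and ID that control its cleanup. */
    static final class ServiceRecord {
        final Persistence persistence;
        final String id;

        ServiceRecord(Persistence persistence, String id) {
            this.persistence = persistence;
            this.id = id;
        }
    }

    private final Set<String> paths = new TreeSet<>();
    private final Map<String, ServiceRecord> records = new HashMap<>();

    /** Step 1, startup: create the root paths (system ACLs omitted in this sketch). */
    void onStartup() {
        paths.add("/services");
        paths.add("/users");
    }

    /** Step 2, app-launch: create the per-user directory (permissions omitted). */
    String onAppLaunch(String username) {
        String userPath = "/users/" + username;
        paths.add(userPath);
        return userPath;
    }

    /** Stores a record under a path, as an application might do after launch. */
    void register(String path, Persistence persistence, String id) {
        records.put(path, new ServiceRecord(persistence, id));
    }

    /** Step 3, completion: remove records whose persistence and ID both match. */
    int onCompletion(Persistence persistence, String id) {
        int before = records.size();
        records.values().removeIf(r -> r.persistence == persistence && r.id.equals(id));
        return before - records.size();
    }

    boolean hasPath(String path) { return paths.contains(path); }

    boolean hasRecord(String path) { return records.containsKey(path); }
}
```

Matching on both the persistence scope and the ID means that a finishing container removes only its own container-scoped records and leaves application-scoped records in place.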
[jira] [Commented] (YARN-3972) Work Preserving AM Restart for MapReduce
[ https://issues.apache.org/jira/browse/YARN-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710927#comment-14710927 ]

Srikanth Sampath commented on YARN-3972:

After discussing with [~vvasudev], we are exploring the use of the YARN Service Registry for containers to locate the MR AppMaster.

Work Preserving AM Restart for MapReduce

Key: YARN-3972
URL: https://issues.apache.org/jira/browse/YARN-3972
Project: Hadoop YARN
Issue Type: Bug
Reporter: Srikanth Sampath
Assignee: Raju Bairishetti
Attachments: WorkPreservingMRAppMaster.pdf

A framework for work-preserving AM restart is provided by [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489]. We would like to take advantage of this for MapReduce (MR) applications. There are some challenges, which are described in the attached document along with a few options. We solicit feedback from the community.
[jira] [Commented] (YARN-2571) RM to support YARN registry
[ https://issues.apache.org/jira/browse/YARN-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710946#comment-14710946 ]

Srikanth Sampath commented on YARN-2571:

What's the status of this patch, [~ste...@apache.org]? I am considering using the YARN registry for the MR AppMaster in YARN-3972 and want to take some learnings from here.

RM to support YARN registry

Key: YARN-2571
URL: https://issues.apache.org/jira/browse/YARN-2571
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Steve Loughran
Assignee: Steve Loughran
Labels: BB2015-05-TBR
Attachments: YARN-2571-001.patch, YARN-2571-002.patch, YARN-2571-003.patch, YARN-2571-005.patch, YARN-2571-007.patch, YARN-2571-008.patch, YARN-2571-009.patch, YARN-2571-010.patch

The RM needs to (optionally) integrate with the YARN registry:
# startup: create the /services and /users paths with system ACLs (yarn, hdfs principals)
# app-launch: create the user directory /users/$username with the relevant permissions (CRD) for them to create subnodes.
# attempt, container, app completion: remove service records with the matching persistence and ID
[jira] [Created] (YARN-3972) Work Preserving AM Restart for MapReduce
Srikanth Sampath created YARN-3972:
-----------------------------------

Summary: Work Preserving AM Restart for MapReduce
Key: YARN-3972
URL: https://issues.apache.org/jira/browse/YARN-3972
Project: Hadoop YARN
Issue Type: Bug
Reporter: Srikanth Sampath

A framework for work-preserving AM restart is provided by [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489]. We would like to take advantage of this for MapReduce (MR) applications. There are some challenges, which are described in the attached document along with a few options. We solicit feedback from the community.
[jira] [Updated] (YARN-3972) Work Preserving AM Restart for MapReduce
[ https://issues.apache.org/jira/browse/YARN-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Srikanth Sampath updated YARN-3972:
----------------------------------

Attachment: WorkPreservingMRAppMaster.pdf

Work Preserving AM Restart for MapReduce

Key: YARN-3972
URL: https://issues.apache.org/jira/browse/YARN-3972
Project: Hadoop YARN
Issue Type: Bug
Reporter: Srikanth Sampath
Assignee: Raju Bairishetti
Attachments: WorkPreservingMRAppMaster.pdf

A framework for work-preserving AM restart is provided by [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489]. We would like to take advantage of this for MapReduce (MR) applications. There are some challenges, which are described in the attached document along with a few options. We solicit feedback from the community.