[jira] [Updated] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node
[ https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3675: - Fix Version/s: 2.8.0 > FairScheduler: RM quits when node removal races with continousscheduling on > the same node > - > > Key: YARN-3675 > URL: https://issues.apache.org/jira/browse/YARN-3675 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot >Priority: Critical > Fix For: 2.8.0, 2.7.1, 3.0.0-alpha1 > > Attachments: YARN-3675.001.patch, YARN-3675.002.patch, > YARN-3675.003.patch > > > With continuous scheduling, scheduling can be done on a node thats just > removed causing errors like below. > {noformat} > 12:28:53.782 AM FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager > Error in handling event type APP_ATTEMPT_REMOVED to the scheduler > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684) > at java.lang.Thread.run(Thread.java:745) > 12:28:53.783 AMINFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye.. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node
[ https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-3675: --- Priority: Critical (was: Major) FairScheduler: RM quits when node removal races with continousscheduling on the same node - Key: YARN-3675 URL: https://issues.apache.org/jira/browse/YARN-3675 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Priority: Critical Attachments: YARN-3675.001.patch, YARN-3675.002.patch, YARN-3675.003.patch With continuous scheduling, scheduling can be done on a node thats just removed causing errors like below. {noformat} 12:28:53.782 AM FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Error in handling event type APP_ATTEMPT_REMOVED to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684) at java.lang.Thread.run(Thread.java:745) 12:28:53.783 AMINFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye.. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node
[ https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-3675: --- Target Version/s: 2.7.1 FairScheduler: RM quits when node removal races with continousscheduling on the same node - Key: YARN-3675 URL: https://issues.apache.org/jira/browse/YARN-3675 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Priority: Critical Attachments: YARN-3675.001.patch, YARN-3675.002.patch, YARN-3675.003.patch With continuous scheduling, scheduling can be done on a node thats just removed causing errors like below. {noformat} 12:28:53.782 AM FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Error in handling event type APP_ATTEMPT_REMOVED to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684) at java.lang.Thread.run(Thread.java:745) 12:28:53.783 AMINFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye.. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node
[ https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3675: Attachment: YARN-3675.001.patch FairScheduler: RM quits when node removal races with continousscheduling on the same node - Key: YARN-3675 URL: https://issues.apache.org/jira/browse/YARN-3675 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3675.001.patch With continuous scheduling, scheduling can be done on a node thats just removed causing errors like below. {noformat} 12:28:53.782 AM FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Error in handling event type APP_ATTEMPT_REMOVED to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684) at java.lang.Thread.run(Thread.java:745) 12:28:53.783 AMINFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye.. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node
[ https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3675: Attachment: YARN-3675.002.patch Fixed checkstyle issue FairScheduler: RM quits when node removal races with continousscheduling on the same node - Key: YARN-3675 URL: https://issues.apache.org/jira/browse/YARN-3675 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3675.001.patch, YARN-3675.002.patch With continuous scheduling, scheduling can be done on a node thats just removed causing errors like below. {noformat} 12:28:53.782 AM FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Error in handling event type APP_ATTEMPT_REMOVED to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684) at java.lang.Thread.run(Thread.java:745) 12:28:53.783 AMINFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye.. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node
[ https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3675: Attachment: YARN-3675.003.patch Removed spurious changes and changed visibility of attemptScheduling FairScheduler: RM quits when node removal races with continousscheduling on the same node - Key: YARN-3675 URL: https://issues.apache.org/jira/browse/YARN-3675 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3675.001.patch, YARN-3675.002.patch, YARN-3675.003.patch With continuous scheduling, scheduling can be done on a node thats just removed causing errors like below. {noformat} 12:28:53.782 AM FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Error in handling event type APP_ATTEMPT_REMOVED to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684) at java.lang.Thread.run(Thread.java:745) 12:28:53.783 AMINFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye.. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)