[jira] [Updated] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore
     [ https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated YARN-2019:
-----------------------------
    Fix Version/s: 2.8.0

> Retrospect on decision of making RM crashed if any exception throw in
> ZKRMStateStore
> ---------------------------------------------------------------------
>
>                 Key: YARN-2019
>                 URL: https://issues.apache.org/jira/browse/YARN-2019
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Junping Du
>            Assignee: Jian He
>            Priority: Critical
>              Labels: ha
>             Fix For: 2.8.0, 2.7.2, 2.6.2, 3.0.0-alpha1
>
>         Attachments: YARN-2019.1-wip.patch, YARN-2019.patch, YARN-2019.patch
>
>
> Currently, if anything abnormal happens in ZKRMStateStore, it throws a fatal
> exception that crashes the RM. As shown in YARN-1924, the cause could be an
> internal RM HA bug rather than a genuinely fatal condition. We should revisit
> this decision, since the HA feature is designed to protect a key component,
> not to disrupt it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore
     [ https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-2019:
------------------------------------------
    Fix Version/s:     (was: 2.8.0)

> Retrospect on decision of making RM crashed if any exception throw in
> ZKRMStateStore
> ---------------------------------------------------------------------
>
>                 Key: YARN-2019
>                 URL: https://issues.apache.org/jira/browse/YARN-2019
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Junping Du
>            Assignee: Jian He
>            Priority: Critical
>              Labels: ha
>             Fix For: 2.7.2, 2.6.2
>
>         Attachments: YARN-2019.1-wip.patch, YARN-2019.patch, YARN-2019.patch
[jira] [Updated] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore
     [ https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuan Gong updated YARN-2019:
----------------------------
    Fix Version/s: 2.6.2
                   2.7.2

> Retrospect on decision of making RM crashed if any exception throw in
> ZKRMStateStore
> ---------------------------------------------------------------------
>
>                 Key: YARN-2019
>                 URL: https://issues.apache.org/jira/browse/YARN-2019
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Junping Du
>            Assignee: Jian He
>            Priority: Critical
>              Labels: ha
>             Fix For: 2.8.0, 2.7.2, 2.6.2
>
>         Attachments: YARN-2019.1-wip.patch, YARN-2019.patch, YARN-2019.patch
[jira] [Updated] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore
     [ https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated YARN-2019:
--------------------------
    Attachment: YARN-2019.patch

> Retrospect on decision of making RM crashed if any exception throw in
> ZKRMStateStore
> ---------------------------------------------------------------------
>
>                 Key: YARN-2019
>                 URL: https://issues.apache.org/jira/browse/YARN-2019
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Junping Du
>            Assignee: Jian He
>            Priority: Critical
>              Labels: ha
>         Attachments: YARN-2019.1-wip.patch, YARN-2019.patch, YARN-2019.patch
[jira] [Updated] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore
     [ https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated YARN-2019:
--------------------------
    Attachment: YARN-2019.patch

Uploaded a patch that adds the "yarn.resourcemanager.fail-fast" config.

> Retrospect on decision of making RM crashed if any exception throw in
> ZKRMStateStore
> ---------------------------------------------------------------------
>
>                 Key: YARN-2019
>                 URL: https://issues.apache.org/jira/browse/YARN-2019
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Junping Du
>            Assignee: Jian He
>            Priority: Critical
>              Labels: ha
>         Attachments: YARN-2019.1-wip.patch, YARN-2019.patch
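For readers skimming the thread, here is a rough sketch of what a fail-fast switch of this kind could look like. It is not the code from YARN-2019.patch. Only the property name "yarn.resourcemanager.fail-fast" comes from the comment above; the class, the enum, the method, the plain Map standing in for the configuration, and the default of false are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not the actual ResourceManager code.
public class FailFastSketch {
    // Property name quoted in the comment above.
    static final String FAIL_FAST_KEY = "yarn.resourcemanager.fail-fast";

    // Hypothetical outcomes for a state-store error.
    enum Action { CRASH, LOG_AND_CONTINUE }

    // Decide how to react to a state-store exception.
    // Assumption: the switch defaults to false when the key is absent.
    static Action onStateStoreError(Map<String, String> conf, Exception cause) {
        boolean failFast =
            Boolean.parseBoolean(conf.getOrDefault(FAIL_FAST_KEY, "false"));
        return failFast ? Action.CRASH : Action.LOG_AND_CONTINUE;
    }

    public static void main(String[] args) {
        Exception cause = new RuntimeException("simulated state-store failure");

        Map<String, String> strict = new HashMap<>();
        strict.put(FAIL_FAST_KEY, "true");
        System.out.println(onStateStoreError(strict, cause));

        Map<String, String> lenient = new HashMap<>();
        System.out.println(onStateStoreError(lenient, cause));
    }
}
```

With a switch like this, an operator can keep the crash-on-error behaviour while debugging and turn it off where RM availability matters more.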
[jira] [Updated] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore
     [ https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsuyoshi OZAWA updated YARN-2019:
---------------------------------
    Attachment: YARN-2019.1-wip.patch

> Retrospect on decision of making RM crashed if any exception throw in
> ZKRMStateStore
> ---------------------------------------------------------------------
>
>                 Key: YARN-2019
>                 URL: https://issues.apache.org/jira/browse/YARN-2019
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Junping Du
>            Priority: Critical
>              Labels: ha
>         Attachments: YARN-2019.1-wip.patch

--
This message was sent by Atlassian JIRA
(v6.2#6252)