[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state
[ https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-3705:
---
    Attachment: YARN-3705.006.patch

bq. If we call resetLeaderElection inside the rmadmin.transitionToStandby(), it will cause a infinite loop.

You are right. I need to make sure that resetLeaderElection is not called when EmbeddedElectorService#becomeStandby calls transitionToStandby. Thanks for the good catch, [~xgong].

I attached 006. By manually starting RM-HA with the patched jar, I checked that the loop does not occur, but it is difficult to test this in a unit test.

forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state
---
    Key: YARN-3705
    URL: https://issues.apache.org/jira/browse/YARN-3705
    Project: Hadoop YARN
    Issue Type: Sub-task
    Reporter: Masatake Iwasaki
    Assignee: Masatake Iwasaki
    Attachments: YARN-3705.001.patch, YARN-3705.002.patch, YARN-3705.003.patch, YARN-3705.004.patch, YARN-3705.005.patch, YARN-3705.006.patch

Executing {{rmadmin -transitionToStandby --forcemanual}} in automatic-failover.enabled mode makes the ResourceManager standby while keeping the state of ActiveStandbyElector. It should make the elector quit and rejoin the election so that other candidates can be promoted; otherwise, forcemanual transition should not be allowed in automatic-failover mode, in order to avoid confusion.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
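The loop discussed in the 006 comment can be illustrated with a small, self-contained Java sketch. This is a simplified model and not the patch itself: the class FailoverSketch, the Source enum, and all method bodies are hypothetical, and only the names transitionToStandby, resetLeaderElection and becomeStandby are borrowed from the thread. It assumes that resetting the election makes the elector call back into the standby transition, so the reset has to be skipped when the elector itself is the caller.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of the call cycle; not Hadoop code. */
class FailoverSketch {
  /** Who asked for the transition (hypothetical enum). */
  enum Source { MANUAL_FORCED, ELECTOR }

  final List<String> log = new ArrayList<>();
  private final boolean autoFailoverEnabled;

  FailoverSketch(boolean autoFailoverEnabled) {
    this.autoFailoverEnabled = autoFailoverEnabled;
  }

  /** Models the admin-side transition to standby. */
  void transitionToStandby(Source source) {
    log.add("standby");
    // Guard: only a forced manual request resets the election. Without the
    // source check, the elector callback below would recurse forever.
    if (autoFailoverEnabled && source == Source.MANUAL_FORCED) {
      resetLeaderElection();
    }
  }

  /** Models the elector quitting and rejoining the election. */
  void resetLeaderElection() {
    log.add("quit-election");
    becomeStandby(); // assumed: quitting makes the elector report standby
    log.add("rejoin-election");
  }

  /** Models the elector callback that re-enters the transition. */
  void becomeStandby() {
    transitionToStandby(Source.ELECTOR);
  }

  public static void main(String[] args) {
    FailoverSketch rm = new FailoverSketch(true);
    rm.transitionToStandby(Source.MANUAL_FORCED);
    System.out.println(rm.log);
  }
}
```

In this model a forced manual request records one quit and one rejoin, while the nested call made on behalf of the elector only records the standby transition and returns.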
[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state
[ https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-3705:
---
    Attachment: YARN-3705.005.patch

The findbugs warning is about synchronization of the {{autoFailoverEnabled}} variable.

{noformat}
M M IS: Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.AdminService.autoFailoverEnabled; locked 66% of time
Unsynchronized access at AdminService.java:[line 129]
{noformat}

I removed the check of this variable in 005 because an equivalent check is done in AdminService#resetLeaderElection.
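As a rough illustration of the change described for 005, the sketch below keeps every access to the flag inside synchronized methods and lets the transition delegate to the reset method instead of reading the flag itself. SyncSketch is a hypothetical class, not the real AdminService, and whether the real resetLeaderElection is synchronized is an assumption made here for the example.

```java
/** Hypothetical model of the synchronization issue; not the real AdminService. */
class SyncSketch {
  // Written and read only while holding the lock on "this".
  private boolean autoFailoverEnabled;

  synchronized void init(boolean enabled) {
    autoFailoverEnabled = enabled;
  }

  /** The single place that checks the flag, always under the lock. */
  synchronized boolean resetLeaderElection() {
    if (!autoFailoverEnabled) {
      return false; // nothing to reset in manual-failover mode
    }
    // ... the elector would quit and rejoin the election here ...
    return true;
  }

  /**
   * Does not read autoFailoverEnabled itself. An unsynchronized
   * "if (autoFailoverEnabled)" here is the kind of access findbugs flags.
   */
  boolean transitionToStandby() {
    return resetLeaderElection();
  }

  public static void main(String[] args) {
    SyncSketch admin = new SyncSketch();
    admin.init(true);
    System.out.println("election reset: " + admin.transitionToStandby());
  }
}
```

Because the flag is only touched in one guarded place, the duplicate check in the caller becomes unnecessary, which matches the reasoning given in the comment.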
[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state
[ https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-3705:
---
    Attachment: YARN-3705.004.patch

I should not have used ResourceManager#handleTransitionToStandBy because of the way it handles errors.

{code}
    } catch (Exception e) {
      LOG.fatal("Failed to transition RM to Standby mode.");
      ExitUtil.terminate(1, e);
{code}

004 just uses AdminService#resetLeaderElection. TestWorkPreservingRMRestart and TestRMRestart succeeded in my local environment.
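To make the error-handling concern concrete, here is a minimal Java sketch contrasting the two styles. All names in it (ErrorHandlingSketch, doTransition, transitionOrTerminate, transitionOrReport) are hypothetical, and the Runnable only stands in for the ExitUtil.terminate call quoted above so that the example can run without ending the JVM.

```java
import java.io.IOException;

/** Hypothetical contrast of two error-handling styles; not Hadoop code. */
class ErrorHandlingSketch {
  /** Stand-in for the standby transition; fails when asked to. */
  static void doTransition(boolean fail) throws IOException {
    if (fail) {
      throw new IOException("simulated transition failure");
    }
  }

  /** Style of the quoted catch block: any failure ends the process. */
  static void transitionOrTerminate(boolean fail, Runnable terminate) {
    try {
      doTransition(fail);
    } catch (Exception e) {
      terminate.run(); // stands in for ExitUtil.terminate(1, e)
    }
  }

  /** Alternative for an admin command: report the failure to the caller. */
  static String transitionOrReport(boolean fail) {
    try {
      doTransition(fail);
      return "ok";
    } catch (IOException e) {
      return "failed: " + e.getMessage();
    }
  }

  public static void main(String[] args) {
    transitionOrTerminate(true, () -> System.out.println("process would exit"));
    System.out.println(transitionOrReport(true));
  }
}
```

With the first style, a failure during a manually requested transition would take the whole process down, which is presumably why the admin path in 004 avoids that method.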
[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state
[ https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-3705:
---
    Attachment: YARN-3705.002.patch

I attached 002, which addresses the whitespace warnings. TestWorkPreservingRMRestart is not related to the code path that the patch fixes.
[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state
[ https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-3705:
---
    Attachment: YARN-3705.003.patch

The test failure is relevant. ResourceManager#handleTransitionToStandBy is expected to be used only when automatic failover is enabled. I am attaching 003, which also addresses the non-automatic-failover case.
[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state
[ https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-3705:
---
    Attachment: YARN-3705.001.patch

I attached 001, which applies a quick fix for the issue. Adding monitoring by HealthMonitor to AdminService needs more consideration. If it turns out to be worth adding, I will file a follow-up JIRA.
[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state
[ https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-3705:
---
    Description: Executing {{rmadmin -transitionToStandby --forcemanual}} in automatic-failover.enabled mode makes the ResourceManager standby while keeping the state of ActiveStandbyElector. It should make the elector quit and rejoin the election so that other candidates can be promoted; otherwise, forcemanual transition should not be allowed in automatic-failover mode, in order to avoid confusion.

    (was: Executing {{rmadmin -transitionToActive --forcemanual}} and {{rmadmin -transitionToActive --forcemanual}} in automatic-failover.enabled mode changes the active/standby state of ResourceManager while keeping the state of ActiveStandbyElector. It should make elector to quit and rejoin otherwise forcemanual transition should not be allowed in automatic-failover mode in order to avoid confusion.)
[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state
[ https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-3705:
---
    Summary: forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state

    (was: forcemanual transition of RM active/standby state in automatic-failover mode should change elector state)