[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056745#comment-15056745 ] zhihai xu commented on YARN-4209: - This issue won't affect 2.6.x branch, since RMStateStoreState.FENCED state is only added at 2.7.x branch. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.2 > > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15019097#comment-15019097 ] Sangjin Lee commented on YARN-4209: --- Does this issue exist in 2.6.x? Should this be backported to branch-2.6? > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.2 > > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946646#comment-14946646 ] Hudson commented on YARN-4209: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2403 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2403/]) YARN-4209. RMStateStore FENCED state doesnât work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.2 > > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946512#comment-14946512 ] Hudson commented on YARN-4209: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #464 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/464/]) YARN-4209. RMStateStore FENCED state doesn’t work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.2 > > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946394#comment-14946394 ] Hudson commented on YARN-4209: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2434 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2434/]) YARN-4209. RMStateStore FENCED state doesnât work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.2 > > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946390#comment-14946390 ] Hudson commented on YARN-4209: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #1228 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1228/]) YARN-4209. RMStateStore FENCED state doesnât work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.2 > > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946334#comment-14946334 ] Hudson commented on YARN-4209: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #490 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/490/]) YARN-4209. RMStateStore FENCED state doesnât work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.2 > > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946240#comment-14946240 ] Hudson commented on YARN-4209: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #498 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/498/]) YARN-4209. RMStateStore FENCED state doesnât work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * hadoop-yarn-project/CHANGES.txt > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.2 > > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946231#comment-14946231 ] Hudson commented on YARN-4209: -- FAILURE: Integrated in Hadoop-trunk-Commit #8581 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8581/]) YARN-4209. RMStateStore FENCED state doesnât work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * hadoop-yarn-project/CHANGES.txt > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.2 > > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946207#comment-14946207 ] zhihai xu commented on YARN-4209: - Thanks [~rohithsharma] for reviewing and committing the patch! > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.2 > > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945575#comment-14945575 ] Hadoop QA commented on YARN-4209: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12765223/YARN-4209.branch-2.7.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | branch-2 / 5453a63 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9362/console | This message was automatically generated. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945553#comment-14945553 ] zhihai xu commented on YARN-4209: - thanks [~rohithsharma]! Yes, I attached the patch YARN-4209.branch-2.7.patch for branch-2.7. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch, YARN-4209.branch-2.7.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944501#comment-14944501 ] Rohith Sharma K S commented on YARN-4209: - patch apply for branch-2.7.2 is failing.. Can you provide patch for branch-2.7? > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944498#comment-14944498 ] Rohith Sharma K S commented on YARN-4209: - committing shortly > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941207#comment-14941207 ] Rohith Sharma K S commented on YARN-4209: - +1 for the latest patch.. I will wait for a day before committing for others to have look at the final patch. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940385#comment-14940385 ] zhihai xu commented on YARN-4209: - The release audit warnings was pre-existing. No other issue from Jenkins test result. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940222#comment-14940222 ] Hadoop QA commented on YARN-4209: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 14s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 1s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 12s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 25s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 16s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 51s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 61m 10s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 101m 53s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12764642/YARN-4209.002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ecbfd68 | | Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/9326/artifact/patchprocess/patchReleaseAuditProblems.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/9326/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9326/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9326/console | This message was automatically generated. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939524#comment-14939524 ] Hadoop QA commented on YARN-4209: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 15s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 3s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 36s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 14s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 53s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 29s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 27s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 60m 12s | Tests failed in hadoop-yarn-server-resourcemanager. | | | | 100m 49s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12764547/YARN-4209.002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 5db371f | | Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/9320/artifact/patchprocess/patchReleaseAuditProblems.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/9320/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9320/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9320/console | This message was automatically generated. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937959#comment-14937959 ] zhihai xu commented on YARN-4209: - [~rohithsharma], thanks for the review! I uploaded a new patch YARN-4209.002.patch, which addressed all your comments. Please review it. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch, > YARN-4209.002.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936752#comment-14936752 ] Hadoop QA commented on YARN-4209: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 30s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:red}-1{color} | javac | 7m 45s | The applied patch generated 1 additional warning messages. | | {color:green}+1{color} | javadoc | 10m 0s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 47s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 27s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 27s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 55m 59s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 94m 57s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12764394/YARN-4209.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 06abc57 | | javac | https://builds.apache.org/job/PreCommit-YARN-Build/9308/artifact/patchprocess/diffJavacWarnings.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/9308/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9308/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9308/console | This message was automatically generated. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936537#comment-14936537 ] Rohith Sharma K S commented on YARN-4209: - Overall patch looks good. Some comments # The below method can be extracted to method? {code} if (isFenced) { return RMStateStoreState.FENCED; } else { return RMStateStoreState.ACTIVE; } {code} # In the method {{notifyStoreOperationFailed}}, I think no need to obtain write lock since {{updateFencedState}} is synchronous call, just before state transition write lock is obtained which is at lower level. Does it really require? > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936499#comment-14936499 ] zhihai xu commented on YARN-4209: - Hi [~rohithsharma], I uploaded a new patch YARN-4209.001.patch, which uses MultipleArcTransition. Create private function {{notifyStoreOperationFailedInternal}}, now {{notifyStoreOperationFailed}} will only be called by {{ZKRMStateStore#VerifyActiveStatusThread}}. So I acquire {{writeLock}} and check {{isFencedState}} in {{notifyStoreOperationFailed}} to make sure {{handleTransitionToStandBy}} is only called once. Please review it, thanks. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch, YARN-4209.001.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934695#comment-14934695 ] zhihai xu commented on YARN-4209: - Thanks for the review [~rohithsharma]! Yes, that is a good point! Using MultipleArcTransition will be a better solution. I will implement a new patch using MultipleArcTransition. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934670#comment-14934670 ] Rohith Sharma K S commented on YARN-4209: - I debugged your test case, and I got your point. Good catch!! About the patch, Moving {{updateFencedState(); }} into StandByTransitionThread would solve the problem, but randomly does not ensure stateMachine is moved to FENCED state since it is asynchronous. It means if any other events are competing for obtaining write lock, then moving to FENCED state might be delayed. Thinking how about changing to MultipleArcTransition? so that all the exception handling return FENCED state. It also ensures state is in FENCED synchronously. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934605#comment-14934605 ] Rohith Sharma K S commented on YARN-4209: - bq. removeRMDelegationToken => handleStoreEvent => enter external stateMachine.doTransition => RemoveRMDTTransition => notifyStoreOperationFailed =>updateFencedState=>handleStoreEvent=> enter internal stateMachine.doTransition => exit internal *stateMachine.doTransition change state to FENCED => exit external stateMachine.doTransition change state to ACTIVE.* Do you mean transition happened from ACITVE->FENCED and later FENCED->ACTIVE back? If so CMIIW, state machine does not have FENCED->ACTIVE state defined. It should transition FENCED->FENCED. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934261#comment-14934261 ] zhihai xu commented on YARN-4209: - Hi [~jianhe], Could you help review the patch? I add lock and check {{isFencedState}} in {{StandByTransitionThread}} to make sure {{handleTransitionToStandBy}} and {{updateFencedState}} are only called once to avoid any potential race condition. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934170#comment-14934170 ] Hadoop QA commented on YARN-4209: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 58s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 0s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 12s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 49s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 29s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 36s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 27s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 56m 29s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 96m 30s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12764079/YARN-4209.000.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 50741cb | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/9289/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9289/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9289/console | This message was automatically generated. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14926815#comment-14926815 ] Hadoop QA commented on YARN-4209: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 1s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 4s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 16s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 48s | The applied patch generated 1 new checkstyle issues (total was 50, now 51). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 29s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 55m 54s | Tests failed in hadoop-yarn-server-resourcemanager. | | | | 96m 5s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.TestApplicationMasterService | | | hadoop.yarn.server.resourcemanager.TestRMAdminService | | | hadoop.yarn.server.resourcemanager.scheduler.TestSchedulerHealth | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12762656/YARN-4209.000.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 66dad85 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/9281/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/9281/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9281/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9281/console | This message was automatically generated. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even after {{notifyStoreOperationFailed}} is called. The only working > case for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14910141#comment-14910141 ] zhihai xu commented on YARN-4209: - Also the test case in the patch can verify this issue. Without the change, the RMStateStore is still in ACTIVE state even after {{updateFencedState}} is called. > RMStateStore FENCED state doesn’t work due to updateFencedState called by > stateMachine.doTransition > --- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch > > > RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by > {{stateMachine.doTransition}}. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even {{notifyStoreOperationFailed}} is called. The only working case > for FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work
[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14910128#comment-14910128 ] zhihai xu commented on YARN-4209: - I attached a patch YARN-4209.000.patch which move {{updateFencedState}} from {{notifyStoreOperationFailed}} to {{StandByTransitionThread}}. So {{updateFencedState}} won't be called by {{stateMachine.doTransition}}. > RMStateStore FENCED state doesn’t work > -- > > Key: YARN-4209 > URL: https://issues.apache.org/jira/browse/YARN-4209 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Attachments: YARN-4209.000.patch > > > RMStateStore FENCED state doesn’t work. The reason is > {{stateMachine.doTransition}} called from {{updateFencedState}} is embedded > in {{stateMachine.doTransition}} called from public > API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So > right after the internal state transition from {{updateFencedState}} changes > the state to FENCED state, the external state transition changes the state > back to ACTIVE state. The end result is that RMStateStore is still in ACTIVE > state even notifyStoreOperationFailed is called. The only working case for > FENCED state is {{notifyStoreOperationFailed}} called from > {{ZKRMStateStore#VerifyActiveStatusThread}}. > For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter > external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => > {{notifyStoreOperationFailed}} > =>{{updateFencedState}}=>{{handleStoreEvent}}=> enter internal > {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}} > change state to FENCED => exit external {{stateMachine.doTransition}} change > state to ACTIVE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)