[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14528934#comment-14528934 ] Xuan Gong commented on YARN-1878: - Cancel the patch. Looks like that we need more discussion on this one Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525527#comment-14525527 ] Hadoop QA commented on YARN-1878: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 36s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 31s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 31s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 52s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 42s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 0m 22s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 56s | Tests passed in hadoop-yarn-common. | | | | 41m 4s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12636856/YARN-1878.1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 6ae2a0d | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/7662/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7662/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7662/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7662/console | This message was automatically generated. Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13961319#comment-13961319 ] Karthik Kambatla commented on YARN-1878: The patch might be safe, but I don't think it is a good idea to get this in without running some perf tests. Also, I don't think it needs to be a blocker. All the patch does is reduce the timeout, users should be able to do that by themselves. Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Priority: Blocker Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960929#comment-13960929 ] Arun C Murthy commented on YARN-1878: - [~xgong] is this ready to go? Let's get this into 2.4.1. Tx. Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947585#comment-13947585 ] Tsuyoshi OZAWA commented on YARN-1878: -- I got it. I confirmed code, and LGTM. +1 for the patch. Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947615#comment-13947615 ] Karthik Kambatla commented on YARN-1878: Please note that reducing yarn.resourcemanager.zk-timeout-ms reduces the timeout for ZK-store session as well. I am not sure going as low as 5 seconds is acceptable for store operations. [~xgong] - would it be possible to run a reasonably heavy workload (1000s of apps) with 5s default, and see if that causes any issues? Otherwise, I could set one of our RM HA clusters to use 5s timeout and see what happens. Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947513#comment-13947513 ] Arpit Gupta commented on YARN-1878: --- In comparison HDFS Namenode usually can transition much faster than 10s. Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947528#comment-13947528 ] Xuan Gong commented on YARN-1878: - In ActiveStandbyElector#process, it waits zkSessionTimeout (The default value is 10s) to process the event. I think this is the reason why it takes so long for standby RM transits to active. To address this issue, we can simply change the default value to 5s, which is used in HDFS. Also, we can adjust this config by setting yarn.resourcemanager.zk-timeout-ms in YARN-SITE.XML Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947543#comment-13947543 ] Hadoop QA commented on YARN-1878: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636856/YARN-1878.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3461//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3461//console This message is automatically generated. Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947550#comment-13947550 ] Jian He commented on YARN-1878: --- Xuan, did you verify in real cluster that this transition does get faster after making this change? Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947558#comment-13947558 ] Tsuyoshi OZAWA commented on YARN-1878: -- Xuan, I'd like to clarify one point: do you intend to speedup ProtocolHATestBase? Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947568#comment-13947568 ] Xuan Gong commented on YARN-1878: - bq. Xuan, did you verify in real cluster that this transition does get faster after making this change? Yes, verified Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947569#comment-13947569 ] Xuan Gong commented on YARN-1878: - bq. Xuan, I'd like to clarify one point: do you intend to speedup ProtocolHATestBase? Sorry, I do not get this. This is not for the testcase. It happens in real cluster Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13947575#comment-13947575 ] Fengdong Yu commented on YARN-1878: --- +1 for the patch. HDFS failover is also 5s by default. Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that some times it can take upto 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)