[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490355#comment-13490355 ]

Hudson commented on HBASE-5970:
---

Integrated in HBase-0.94-security-on-Hadoop-23 #9 (See [https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/9/])
HBASE-7038 Port HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event to 0.94 (Sergey Shelukhin) (Revision 1403787)

Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

Improve the AssignmentManager#updateTimer and speed up handling opened event

Key: HBASE-5970
URL: https://issues.apache.org/jira/browse/HBASE-5970
Project: HBase
Issue Type: Improvement
Components: master
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
Fix For: 0.96.0
Attachments: 5970v3.patch, HBASE-5970.patch, HBASE-5970v2.patch, HBASE-5970v3.patch, HBASE-5970v4.patch, HBASE-5970v4.patch

We found that handling the opened event is very slow in an environment with lots of regions.
The problem is the slow AssignmentManager#updateTimer.
We tested bulk assigning 10w (i.e. 100k) regions; the whole bulk assignment took about 1 hour.

2012-05-06 20:31:49,201 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 10 region(s) round-robin across 5 server(s)
2012-05-06 21:26:32,103 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done

I think we could improve AssignmentManager#updateTimer: make a separate thread do this work.
After the improvement, it took only about 4.5 minutes.

2012-05-07 11:03:36,581 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 10 region(s) across 5 server(s), retainAssignment=true
2012-05-07 11:07:57,073 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
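The proposal in the description, moving the timer update off the opened-event path and into its own thread, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch: the class `TimerUpdaterSketch`, its methods, and the use of plain strings for regions and servers are all hypothetical. The event handler only records which server needs its timers refreshed; a periodic task does the actual scan.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and data structures are assumptions, not HBase code.
final class TimerUpdaterSketch {
    // region name -> server currently handling it
    private final Map<String, String> regionToServer = new ConcurrentHashMap<>();
    // region name -> last timestamp (ms) recorded for that region in transition
    private final Map<String, Long> regionStamp = new ConcurrentHashMap<>();
    // servers whose regions are waiting for a timestamp refresh
    private final ConcurrentSkipListSet<String> pendingServers = new ConcurrentSkipListSet<>();

    void addRegionInTransition(String region, String server, long nowMs) {
        regionToServer.put(region, server);
        regionStamp.put(region, nowMs);
    }

    // Called on the opened-event path: records the server and returns at once.
    void requestTimerUpdate(String server) {
        pendingServers.add(server);
    }

    // One pass of the background work; returns how many regions were refreshed.
    int runOnce(long nowMs) {
        int refreshed = 0;
        String server;
        while ((server = pendingServers.pollFirst()) != null) {
            for (Map.Entry<String, String> e : regionToServer.entrySet()) {
                if (e.getValue().equals(server)) {
                    regionStamp.put(e.getKey(), nowMs);
                    refreshed++;
                }
            }
        }
        return refreshed;
    }

    // Throws NullPointerException if the region was never added.
    long stampOf(String region) {
        return regionStamp.get(region);
    }

    // Runs runOnce on a daemon thread every periodMs milliseconds.
    ScheduledExecutorService start(long periodMs) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "timer-updater-sketch");
            t.setDaemon(true);
            return t;
        });
        pool.scheduleAtFixedRate(() -> runOnce(System.currentTimeMillis()),
                periodMs, periodMs, TimeUnit.MILLISECONDS);
        return pool;
    }

    public static void main(String[] args) {
        TimerUpdaterSketch sketch = new TimerUpdaterSketch();
        sketch.addRegionInTransition("region-a", "rs1", 100L);
        sketch.addRegionInTransition("region-b", "rs2", 100L);
        sketch.requestTimerUpdate("rs1");          // cheap call from the event handler
        int refreshed = sketch.runOnce(500L);      // normally done by the background thread
        System.out.println("refreshed=" + refreshed + ", region-a stamp=" + sketch.stampOf("region-a"));
    }
}
```

The point of the design is that the cost of scanning the regions in transition is paid once per period on a separate thread, instead of once per opened event on the handler.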
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490414#comment-13490414 ]

chunhui shen commented on HBASE-5970:
-

[~yangming]
Please see HBASE-7018; it explains why your cluster handles the opened event slowly.
In my testing, the cluster did not have the problem described in HBASE-7018, so its bottleneck was on the master.
Also, please increase the number of open-region threads on the regionserver.

If you still have problems, you can email me directly...
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490420#comment-13490420 ]

yang ming commented on HBASE-5970:
--

[~zjushch] OK, thanks very much! I will try it.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489944#comment-13489944 ]

yang ming commented on HBASE-5970:
--

[~zjushch] I have tried this patch on 0.94.2 with 100,000 regions (one empty table) and 4 RS. I restarted the cluster, but found that handling the opened event was still very slow.
I did not see any improvement. Can you show us your test environment? Thanks.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489950#comment-13489950 ]

yang ming commented on HBASE-5970:
--

[~zjushch] Or does this patch depend on other modifications? Thanks.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487271#comment-13487271 ]

Hudson commented on HBASE-5970:
---

Integrated in HBase-0.94 #560 (See [https://builds.apache.org/job/HBase-0.94/560/])
HBASE-7038 Port HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event to 0.94 (Sergey Shelukhin) (Revision 1403787)

Result = SUCCESS
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293796#comment-13293796 ]

ramkrishna.s.vasudevan commented on HBASE-5970:
---

@Chunhui
I have a question here. We tried this patch on 0.94 with 2 regions and 4 RS. The scenario we tried was to disable and enable a table that had 2 regions. We did not see much improvement.
Is there a specific scenario in which you see an improvement? Thanks.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294063#comment-13294063 ]

chunhui shen commented on HBASE-5970:
-

@ram
How much time did it take you to enable 2 regions? If there are not many regions, the master handles them fast enough. Could you test with 100,000 regions?
In our testing, the RS opens regions fast, but the master handles the opened event slowly, and we found 100,000 regions to be much, much slower than 50,000 regions.

Another reason: recently we found that the RS opens regions very slowly after 0.92 if one table has many regions.
You can see the details in the following code:
{code}
public RegionOpeningState openRegion(HRegionInfo region, int versionOfOfflineNode){
  ...
  // every region open looks up the table descriptor
  HTableDescriptor htd = this.tableDescriptors.get(region.getTableName());
  ...
}

public HTableDescriptor get(final String tablename){
  ...
  // each lookup checks the table info modification time on the file system
  long modtime = getTableInfoModtime(this.fs, this.rootdir, tablename);
  ...
}
{code}

The call chain is getTableInfoModtime -> getTableInfoPath -> getTableInfoPath -> FSUtils.listStatus().

If one table has many regions, FSUtils.listStatus() takes a long time, and opening regions in parallel on the RS becomes close to serial.

So maybe we should improve the code above.
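One possible direction for the improvement suggested in this comment is to avoid repeating the expensive lookup on every region open. The sketch below is an assumption for illustration only and is not taken from HBase: `ModtimeCacheSketch` is a hypothetical class, and the `loader` function stands in for the slow file-system call. It remembers the last result per table for a short time-to-live.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToLongFunction;

// Hypothetical sketch: a small time-to-live cache in front of a slow per-table lookup.
final class ModtimeCacheSketch {
    private static final class Entry {
        final long value;      // cached modification time
        final long loadedAtMs; // when the value was loaded
        Entry(long value, long loadedAtMs) {
            this.value = value;
            this.loadedAtMs = loadedAtMs;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final ToLongFunction<String> loader; // stands in for the slow file-system call
    private final long ttlMs;

    ModtimeCacheSketch(ToLongFunction<String> loader, long ttlMs) {
        this.loader = loader;
        this.ttlMs = ttlMs;
    }

    // Returns the cached value while it is younger than ttlMs, otherwise reloads it.
    long get(String table, long nowMs) {
        Entry e = cache.get(table);
        if (e == null || nowMs - e.loadedAtMs >= ttlMs) {
            e = new Entry(loader.applyAsLong(table), nowMs);
            cache.put(table, e);
        }
        return e.value;
    }

    public static void main(String[] args) {
        int[] slowCalls = {0};
        ModtimeCacheSketch cache = new ModtimeCacheSketch(table -> {
            slowCalls[0]++;   // count how often the slow path runs
            return 42L;
        }, 1000L);
        for (int i = 0; i < 5; i++) {
            cache.get("t1", 10L * i); // five region opens within the TTL
        }
        System.out.println("slow lookups: " + slowCalls[0]);
    }
}
```

The trade-off is staleness: a change to the table made within the time-to-live is not noticed until the entry expires, so whether this is acceptable depends on how the descriptor is used.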
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294072#comment-13294072 ]

Zhihong Ted Yu commented on HBASE-5970:
---

@Chunhui:
You can open a new issue for improving the above code.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294132#comment-13294132 ]

ramkrishna.s.vasudevan commented on HBASE-5970:
---

@Chunhui
I think N has already made some changes in trunk to the code you are referring to? See https://issues.apache.org/jira/browse/HBASE-5998.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286370#comment-13286370 ]

ramkrishna.s.vasudevan commented on HBASE-5970:
---

@Chunhui
As mentioned in HBASE-5337, now that we update the timers at different times, suppose we bring the timeout monitor back with a lower value such as 5 min. Then updating the timers does not help overall, because it delays the timeout monitor from acting after 5 mins.
I am just noting this so that we recheck it if we later try to bring back the TM.
Nice work, Chunhui.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286378#comment-13286378 ]

chunhui shen commented on HBASE-5970:
-

@ram
bq. it delays the timeout monitor from acting after 5 mins
Yes, but we will delay the timeout monitor by at most 10s (the period is 10s by default). Isn't that acceptable?
{code}
this.timerUpdater = new TimerUpdater(conf.getInt(
    "hbase.master.assignment.timerupdater.period", 10000), master);
{code}
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286380#comment-13286380 ]

ramkrishna.s.vasudevan commented on HBASE-5970:
---

@Chunhui
What I meant is that, overall, updateTimers itself affects the TM. The problem is that updateTimers just updates the time of all the RITs to the current time.
Suppose the TM saw the RIT at time x, and the timeout monitor is set to 5 mins. We expect the TM to reassign exactly after x+5.
But updateTimers will update all the regions in RIT, so even this region, which should be assigned after x+5, will also get updated to the current time.
So now the TM will be able to take action only after x+5+Delta.
Currently, as it is 30 mins, we won't be much affected, since no one will be waiting for 30 mins.
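The x+5+Delta argument can be made concrete with a little arithmetic. The sketch below is illustrative only: `TimeoutDelaySketch` and its method are hypothetical, and it assumes the monitor acts once the timeout has elapsed since the last recorded timestamp. The region enters transition at x = 0 with a 5 minute timeout, and a timer update at x + 4 minutes resets its timestamp.

```java
// Hypothetical sketch of the timing argument; not HBase code.
final class TimeoutDelaySketch {
    // Earliest time (ms) at which a monitor with the given timeout may act,
    // assuming it measures from the last recorded timestamp.
    static long earliestActionMs(long stampMs, long timeoutMs) {
        return stampMs + timeoutMs;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        long x = 0L;                    // region first seen in transition
        long timeout = 5 * minute;      // timeout monitor setting

        long withoutUpdate = earliestActionMs(x, timeout);

        long updateAt = x + 4 * minute; // a timer update resets the timestamp
        long withUpdate = earliestActionMs(updateAt, timeout);

        long delta = withUpdate - withoutUpdate;
        System.out.println("without update: " + withoutUpdate / minute + " min");
        System.out.println("with update:    " + withUpdate / minute + " min");
        System.out.println("delta:          " + delta / minute + " min");
    }
}
```

With these numbers the monitor can act at 5 minutes without the update and only at 9 minutes with it, so Delta is 4 minutes, which is the delay this comment is concerned about.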
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286391#comment-13286391 ]

Hudson commented on HBASE-5970:
---

Integrated in HBase-TRUNK #2963 (See [https://builds.apache.org/job/HBase-TRUNK/2963/])
HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event (Revision 1344569)

Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286392#comment-13286392 ]

chunhui shen commented on HBASE-5970:
-

bq. So now the TM will be able to take action only after x+5+Delta
We use updateTimers to prevent unnecessary multiple assignment of a region; it only updates the time of the RITs on the related server.
Compared to the previous logic, Delta here is at most 10s.

Anyway, multiple assignment is much more serious than an assignment delayed by the TimeoutMonitor.
And we should try our best to ensure that there is no case that needs the TimeoutMonitor to assign a region.

Correct me if I am wrong, thanks.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286396#comment-13286396 ]

ramkrishna.s.vasudevan commented on HBASE-5970:
-----------------------------------------------

@Chunhui
bq. And, we should try our best to ensure that there is no case to need use TimeoutMonitor assign region.

Yes. This is what we are also trying to achieve. Many of our issues are also targeted at that.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286509#comment-13286509 ]

Hudson commented on HBASE-5970:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #34 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/34/])
HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event (Revision 1344569)

Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286197#comment-13286197 ]

stack commented on HBASE-5970:
------------------------------

What do we need to do to get this patch in? Can you update it for trunk please, Chunhui, so we can rerun it by hadoopqa?

On the javadoc suggestion: I'm not sure what you'd write, but write something that makes sense (the javadoc as is is nonsensical).
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286282#comment-13286282 ]

Hadoop QA commented on HBASE-5970:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12530318/HBASE-5970v4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestAssignmentManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2068//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2068//console

This message is automatically generated.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286297#comment-13286297 ]

stack commented on HBASE-5970:
------------------------------

Chunhui, do you think the failure in AM is because of this patch? Should we retry it against hadoopqa?

I was wondering what bound we have on the number of threads when splitting out on the regionserver? It should be low by default I'd say.

Good stuff.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286324#comment-13286324 ]

chunhui shen commented on HBASE-5970:
-------------------------------------

bq. I was wondering what bound we have on the number of threads when splitting out on the regionserver? It should be low by default I'd say.

Sorry, Stack, I'm not clear about this; is it related to this issue?
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286326#comment-13286326 ]

stack commented on HBASE-5970:
------------------------------

bq. Sorry, Stack, I'm not clear about this, is it related to this issue?

It is. I found the answer reviewing the last patch up on rb (answer is 3). Thanks
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286342#comment-13286342 ]

chunhui shen commented on HBASE-5970:
-------------------------------------

I ran the failed test case (org.apache.hadoop.hbase.master.TestAssignmentManager) and it passed on my local PC.

I also think I found the reason why it failed: it is caused by TestAssignmentManager#testRegionPlanIsUpdatedWhenRegionFailsToOpen.

At the end of testRegionPlanIsUpdatedWhenRegionFailsToOpen, the AM handles a region in the state RS_ZK_REGION_FAILED_OPEN, in AssignmentManager#handleRegion():
{code}
case RS_ZK_REGION_FAILED_OPEN:
  ...
  this.executorService.submit(new ClosedRegionHandler(master, this, regionState.getRegion()));
{code}
So a thread executes ClosedRegionHandler in the background, and it creates a node after testRegionPlanIsUpdatedWhenRegionFailsToOpen has executed
{code}
public void after() throws KeeperException {
  if (this.watcher != null) {
    // Clean up all znodes
    ZKAssign.deleteAllNodes(this.watcher);
    this.watcher.close();
  }
}
{code}
Hence, the test fails with some probability, and the failure has nothing to do with this patch.
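The ordering problem described in this comment can be illustrated with a small stand-alone sketch. Everything here is hypothetical and is not HBase code: `CleanupOrderingSketch`, its `nodes` set (standing in for znodes), and `handleFailedOpen` are invented names, and a plain `java.util.concurrent` executor stands in for the master's executor service. It shows one possible way to make a teardown deterministic, namely draining background work before deleting nodes; it is not the fix that was adopted for the test.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; names do not come from HBase. */
class CleanupOrderingSketch {
    // Stand-in for the znodes that the test teardown deletes.
    final Set<String> nodes = ConcurrentHashMap.newKeySet();
    // Stand-in for the executor that runs handlers in the background.
    final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Handling the event only schedules work; the node is created later, on another thread. */
    void handleFailedOpen(String region) {
        executor.execute(() -> nodes.add(region));
    }

    /**
     * Teardown that first waits for scheduled work, then deletes all nodes.
     * Without the wait, a background task could recreate a node after cleanup.
     * Returns true if the executor drained within the timeout.
     */
    boolean after() {
        executor.shutdown();
        boolean drained;
        try {
            drained = executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            drained = false;
        }
        nodes.clear();
        return drained;
    }
}
```

If `after()` cleared `nodes` without waiting, the outcome would depend on thread scheduling, which matches the "fails with some probability" behaviour reported above.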
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286346#comment-13286346 ]

stack commented on HBASE-5970:
------------------------------

Excellent, Chunhui. Can you file a new issue with what you found on AM (any chance of a patch to fix it?)

Also, you are correct above when you think my question about 'number of threads' belonged to another issue. Pardon my being confused.

Let me commit this patch.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286347#comment-13286347 ]

stack commented on HBASE-5970:
------------------------------

Committed to trunk. Thanks for the patch Chunhui.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286354#comment-13286354 ]

Hadoop QA commented on HBASE-5970:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12530329/HBASE-5970v4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2071//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2071//console

This message is automatically generated.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280367#comment-13280367 ]

nkeywal commented on HBASE-5970:
--------------------------------

Hi,

Could you share the logs of the tests? I would be interested to have a look at them.

The javadoc for updateTimers says it's not used for bulk assignment. Is there a mix of 'bulk assigned' regions and other regions?
I also see in the description that the time was measured once with 'retainAssignment=true' and once without. Are the results comparable in both cases?

Thank you!
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13279526#comment-13279526 ]

chunhui shen commented on HBASE-5970:
-------------------------------------

bq. Why is doing this work in the background faster? Is it just that it being inline, it takes a noticeable amount of time?

AssignmentManager#updateTimers checks each region in RIT, and there are lots of regions in RIT at startup, so it takes a long time (about 30ms with 100k regions in RIT).
However, in the current logic we call AssignmentManager#updateTimers for each opened event, so handling all the opened events takes far too long.

If we do the updateTimers work in the background, we don't need to wait for it when handling an opened event.
Also, we would not run updateTimers repeatedly for the same regionserver within a short time. (In fact, when the cluster starts up, lots of opened events arrive from the same regionserver at the same time, and we don't need to do this work many times at that moment.)
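As a rough illustration of the explanation above, the idea can be sketched as follows. This is a simplified, hypothetical model and not the code from the attached patches: servers are plain strings instead of ServerName, the regions-in-transition map is a stand-in, and `requestTimerUpdate`, `runUpdaterOnce`, and `scanCount` are invented names. Only `serversInUpdatingTimer` and the `ConcurrentSkipListSet` type echo names quoted elsewhere in this thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;

/** Hypothetical sketch of deferring timer updates to a background pass. */
class TimerUpdateSketch {
    /** Stand-in for one region in transition (RIT). */
    static final class RegionInTransition {
        final String server;
        volatile long stamp;
        RegionInTransition(String server, long stamp) {
            this.server = server;
            this.stamp = stamp;
        }
    }

    private final Map<String, RegionInTransition> regionsInTransition = new ConcurrentHashMap<>();
    // Servers whose RIT timers still need refreshing; a set, so repeated requests collapse.
    private final ConcurrentSkipListSet<String> serversInUpdatingTimer = new ConcurrentSkipListSet<>();
    private int scans = 0;

    void addRegionInTransition(String region, String server, long stamp) {
        regionsInTransition.put(region, new RegionInTransition(server, stamp));
    }

    /** Called from the opened-event path: records the server, does not scan the RIT map. */
    void requestTimerUpdate(String server) {
        serversInUpdatingTimer.add(server);
    }

    /** One pass of the background updater; returns how many servers were processed. */
    int runUpdaterOnce(long now) {
        int handled = 0;
        String server;
        while ((server = serversInUpdatingTimer.pollFirst()) != null) {
            updateTimers(server, now);
            handled++;
        }
        return handled;
    }

    /** The expensive part: a full scan over all regions in transition. */
    private void updateTimers(String server, long now) {
        scans++;
        for (RegionInTransition rit : regionsInTransition.values()) {
            if (rit.server.equals(server)) {
                rit.stamp = now;
            }
        }
    }

    int scanCount() { return scans; }

    long stampOf(String region) { return regionsInTransition.get(region).stamp; }
}
```

In this model, many requests for the same server between two updater passes cost a single scan. As a back-of-the-envelope check using the figures quoted above, 100,000 opened events at roughly 30 ms per inline scan would add up to about 3,000 s, or 50 minutes, which is the same order as the gap of about 55 minutes between the two log timestamps in the issue description. This is only an upper-bound estimate, since the number of regions in RIT shrinks as regions open.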
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13279531#comment-13279531 ]

chunhui shen commented on HBASE-5970:
-------------------------------------

@stack
bq. You are addressing this comment in updateTimers? {code}// This loop could be expensive.{code}

I keep this loop, but greatly reduce the number of times it is called.

bq. Is this javadoc right?

I'm not sure; could you give a suggestion?

Thanks for the review.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279118#comment-13279118 ]

stack commented on HBASE-5970:

I like the effect this patch has.

Are you addressing this comment in updateTimers?
{code}
// This loop could be expensive.
{code}

Should this be in the new timer class?
{code}
+  private final ConcurrentSkipListSet<ServerName> serversInUpdatingTimer =
+    new ConcurrentSkipListSet<ServerName>();
{code}

Is this javadoc right?
{code}
+   * Add the server to serversInUpdatingTimer, and wait {@link TimerUpdater} to
+   * update timers for all regions in transition going against this server.
{code}

Why is doing this work in the background faster? Is it just that, done inline, it takes a noticeable amount of time?

This is a nice improvement.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276090#comment-13276090 ]

Hadoop QA commented on HBASE-5970:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12526715/HBASE-5970v3.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 31 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1872//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1872//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1872//console

This message is automatically generated.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276272#comment-13276272 ]

Hadoop QA commented on HBASE-5970:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12527518/5970v3.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 31 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
    org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1877//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1877//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1877//console

This message is automatically generated.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271127#comment-13271127 ]

chunhui shen commented on HBASE-5970:

I don't think we need to update the timers so eagerly. Waiting for the next chore() could save a lot of work when there are many opened events, because many of them may update the same server.
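The coalescing argument in this comment can be illustrated with a minimal sketch. It is not the code from the patch: the class name `CoalescingTimerUpdater`, the use of plain `String` server names, and the scan counter are assumptions made for illustration, and the set is drained with `pollFirst()` for brevity rather than the `higher()`-based walk quoted elsewhere in the thread.

```java
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the patch's TimerUpdater; not the actual HBase code. */
class CoalescingTimerUpdater {
  // Servers whose regions-in-transition timers still need refreshing.
  // A set, so repeated requests for the same server collapse into one entry.
  private final ConcurrentSkipListSet<String> serversInUpdatingTimer =
      new ConcurrentSkipListSet<>();
  // Counts how often the (assumed expensive) per-server scan actually ran.
  private final AtomicInteger expensiveScans = new AtomicInteger();

  /** Called from the opened-event handler: cheap, only records the server. */
  void requestUpdate(String serverName) {
    serversInUpdatingTimer.add(serverName);
  }

  /** Called once per chore period: one scan per distinct pending server. */
  int chore() {
    int processed = 0;
    String server;
    while ((server = serversInUpdatingTimer.pollFirst()) != null) {
      updateTimers(server);
      processed++;
    }
    return processed;
  }

  // Placeholder for the loop over all regions in transition.
  private void updateTimers(String server) {
    expensiveScans.incrementAndGet();
  }

  int expensiveScans() {
    return expensiveScans.get();
  }
}

class CoalescingDemo {
  public static void main(String[] args) {
    CoalescingTimerUpdater updater = new CoalescingTimerUpdater();
    // 1000 opened events spread over 5 servers.
    for (int i = 0; i < 1000; i++) {
      updater.requestUpdate("rs" + (i % 5));
    }
    updater.chore();
    System.out.println("scans=" + updater.expensiveScans()); // prints "scans=5"
  }
}
```

With inline updates, each of the 1000 events would trigger its own scan; deferring to the chore reduces that to one scan per distinct server per period, at the cost of the timers being refreshed slightly later.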
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271153#comment-13271153 ]

Zhihong Yu commented on HBASE-5970:

The above assumption is reasonable.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271983#comment-13271983 ]

Ming Ma commented on HBASE-5970:

Looks good. Why do we need true in while (true && !stopper.isStopped())?
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272025#comment-13272025 ]

chunhui shen commented on HBASE-5970:

bq. Why do we need true in while (true && !stopper.isStopped())?
Sorry, that was a mistake; it is corrected in patch v2.

Thanks for the review.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271103#comment-13271103 ]

ramkrishna.s.vasudevan commented on HBASE-5970:

@Chunhui
HBASE-5337 (AM.updateTimers() delays the timeout monitor from assigning regions) covers the same problem. I think I can mark it as a duplicate of this issue.

A few comments:
Can serversInUpdatingTimer be renamed to updateTimerForServer?
Does the order in which the updates happen matter?
{code}
    Threads.setDaemonThreadRunning(timerUpdater.getThread(),
+        master.getServerName() + ".timerUpdater");
{code}
Can we move this into a protected method? We found that doing so helps with mocking the master when testing these scenarios.

Patch looks good.
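The suggestion to move the thread start-up into a protected method can be sketched as follows. This is only an illustration of the testing technique: `AssignmentManagerSketch`, `startTimerUpdater()`, and `NoThreadAssignmentManager` are hypothetical names, and the real AssignmentManager starts its thread through `Threads.setDaemonThreadRunning`, which is replaced here by plain `java.lang.Thread` calls.

```java
/** Hypothetical, heavily reduced stand-in for AssignmentManager. */
class AssignmentManagerSketch {
  // Placeholder background thread; the real updater loop is omitted.
  private final Thread timerUpdater = new Thread(() -> { }, "timerUpdater");

  /** Normal start-up path; delegates thread creation to an overridable hook. */
  void start() {
    startTimerUpdater();
  }

  /** Protected so that a test subclass can skip or replace thread start-up. */
  protected void startTimerUpdater() {
    timerUpdater.setDaemon(true);
    timerUpdater.start();
  }
}

/** Test double: records the request instead of starting a real thread. */
class NoThreadAssignmentManager extends AssignmentManagerSketch {
  boolean startRequested;

  @Override
  protected void startTimerUpdater() {
    startRequested = true;
  }
}
```

A test can then construct `NoThreadAssignmentManager`, call `start()`, and drive the timer logic by hand without a background thread racing against its assertions.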
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271109#comment-13271109 ]

chunhui shen commented on HBASE-5970:

@ram
Thanks for the review.
Let's see what others say, and I'll make all the changes together.
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271113#comment-13271113 ]

Zhihong Yu commented on HBASE-5970:

{code}
+        serverToUpdateTimer = serversInUpdatingTimer
+            .higher(serverToUpdateTimer);
+      }
+      if (serverToUpdateTimer == null) {
+        break;
{code}
What would happen if a 'lower' ServerName gets its timers updated, gets removed from serversInUpdatingTimer, and is later added back to serversInUpdatingTimer? How would this server be picked up?
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271115#comment-13271115 ]

chunhui shen commented on HBASE-5970:

bq. What would happen if a 'lower' ServerName gets its timers updated, gets removed from serversInUpdatingTimer and later is added back to serversInUpdatingTimer ?
If it is later added back to serversInUpdatingTimer, we will update its timers again after the next chore().
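The behaviour discussed in this exchange can be reproduced with a small single-threaded sketch. It assumes `String` server names and a hypothetical `onePass` method with a callback that simulates a concurrent re-add; only `ConcurrentSkipListSet.first()`, `higher()`, `remove()`, and `add()` are real API calls, and the loop is a simplification of the snippet quoted in the thread, not the patch itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.function.Consumer;

/** Hypothetical sketch of a higher()-based walk over pending servers. */
class HigherIterationSketch {
  final ConcurrentSkipListSet<String> serversInUpdatingTimer =
      new ConcurrentSkipListSet<>();

  /**
   * One chore pass: visit pending servers in ascending order, removing each
   * one as it is handled. afterEach simulates events arriving mid-pass.
   */
  List<String> onePass(Consumer<String> afterEach) {
    List<String> visited = new ArrayList<>();
    String current =
        serversInUpdatingTimer.isEmpty() ? null : serversInUpdatingTimer.first();
    while (current != null) {
      serversInUpdatingTimer.remove(current);
      visited.add(current);
      afterEach.accept(current);
      // Only servers sorting after the current one are seen in this pass.
      current = serversInUpdatingTimer.higher(current);
    }
    return visited;
  }
}

class HigherIterationDemo {
  public static void main(String[] args) {
    HigherIterationSketch sketch = new HigherIterationSketch();
    sketch.serversInUpdatingTimer.add("rs1");
    sketch.serversInUpdatingTimer.add("rs2");
    sketch.serversInUpdatingTimer.add("rs3");

    // While rs2 is being handled, a new opened event re-adds the lower rs1.
    List<String> first = sketch.onePass(s -> {
      if (s.equals("rs2")) {
        sketch.serversInUpdatingTimer.add("rs1");
      }
    });
    List<String> second = sketch.onePass(s -> { });

    System.out.println(first);  // prints "[rs1, rs2, rs3]"
    System.out.println(second); // prints "[rs1]"
  }
}
```

The re-added `rs1` is skipped for the remainder of the first pass because `higher()` only moves forward, but it stays in the set and is handled by the next pass, which matches the answer given above.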
[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event
[ https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271119#comment-13271119 ]

Zhihong Yu commented on HBASE-5970:

That's right. But I wonder why we need to wait for the next chore()?