[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-04-10 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627662#comment-13627662
 ] 

Nicolas Liochon commented on HBASE-7247:


bq. This sync happens every time? Ain't it really expensive? More expensive 
that what used to be there (a write?). Why we need it?
We already had the sync previously. We're just saving the final write.
The sync can be expensive, I don't know if it's always expensive or sometimes 
optimized by ZK. My test hides its cost, because it's a single ZK node and the 
master is on the same node. A real life system with 5 ZK would be more 
realistic. In any case, saving the write is good.

bq. Does this clear any existing watch?
No, it does not set a watcher but should not impact existing ones.

bq. Does the content of retransitionNodeOpening deserve to be generalized and 
moved back into transitionNode but with a flag for whether to write the new 
state or not?

I don't know. Do we have a lot of case like this one?
Globally, the performance problem is that we check the state before writing, 
hence we sync and we read. The next optimization I'm thinking about is to know 
that we were the previous writer, or that we read the previous version already, 
so we can send the write without syncing/reading, relying only on ZooKeeper 
versions. But it's easy to add bugs when doing this.


 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.95.2
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.98.0, 0.95.1

 Attachments: 7247.v1.patch, 7247.v2.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-04-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13628089#comment-13628089
 ] 

stack commented on HBASE-7247:
--

Thanks for explanation.  +1 on commit to trunk and 0.95.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.95.2
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.98.0, 0.95.1

 Attachments: 7247.v1.patch, 7247.v2.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626500#comment-13626500
 ] 

Hudson commented on HBASE-7247:
---

Integrated in HBase-TRUNK #4045 (See 
[https://builds.apache.org/job/HBase-TRUNK/4045/])
HBASE-7247  Assignment performances decreased by 50% because of 
regionserver.OpenRegionHandler#tickleOpening (Revision 1465914)

 Result = SUCCESS
nkeywal : 
Files : 
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java


 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.95.2
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.98.0, 0.95.1

 Attachments: 7247.v1.patch, 7247.v2.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626594#comment-13626594
 ] 

Hudson commented on HBASE-7247:
---

Integrated in hbase-0.95-on-hadoop2 #62 (See 
[https://builds.apache.org/job/hbase-0.95-on-hadoop2/62/])
HBASE-7247  Assignment performances decreased by 50% because of 
regionserver.OpenRegionHandler#tickleOpening (Revision 1465915)

 Result = FAILURE
nkeywal : 
Files : 
* 
/hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* 
/hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java


 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.95.2
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.98.0, 0.95.1

 Attachments: 7247.v1.patch, 7247.v2.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626736#comment-13626736
 ] 

Hudson commented on HBASE-7247:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #489 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/489/])
HBASE-7247  Assignment performances decreased by 50% because of 
regionserver.OpenRegionHandler#tickleOpening (Revision 1465914)

 Result = FAILURE
nkeywal : 
Files : 
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java


 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.95.2
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.98.0, 0.95.1

 Attachments: 7247.v1.patch, 7247.v2.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-04-09 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626784#comment-13626784
 ] 

stack commented on HBASE-7247:
--

[~nkeywal] This sync happens every time?  Ain't it really expensive?  More 
expensive that what used to be there (a write?).  Why we need it?

+String node = getNodeName(zkw, encoded);
+zkw.sync(node);

Does this clear any existing watch?

+byte [] existingBytes = ZKUtil.getDataNoWatch(zkw, node, stat);

Does the content of retransitionNodeOpening deserve to be generalized and moved 
back into transitionNode but with a flag for whether to write the new state or 
not?

Otherwise patch looks good.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.95.2
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.98.0, 0.95.1

 Attachments: 7247.v1.patch, 7247.v2.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-04-08 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625224#comment-13625224
 ] 

Nicolas Liochon commented on HBASE-7247:


Waiting for a review on this one. Works locally, TestHTableMultiplexer is well 
known as flaky.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.95.2
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.95.1

 Attachments: 7247.v1.patch, 7247.v2.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-04-08 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625517#comment-13625517
 ] 

Jimmy Xiang commented on HBASE-7247:


v2 Looks good to me.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.95.2
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.95.1

 Attachments: 7247.v1.patch, 7247.v2.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-04-05 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13623505#comment-13623505
 ] 

Nicolas Liochon commented on HBASE-7247:


v2. Results are consistent: always 85s vs. 120s. I've done a yourkit profiling 
against it: all the time is spent in ZK in my test scenario.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.95.1

 Attachments: 7247.v1.patch, 7247.v2.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-04-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13623536#comment-13623536
 ] 

Hadoop QA commented on HBASE-7247:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12577192/7247.v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestHTableMultiplexer

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5148//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5148//console

This message is automatically generated.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.95.1

 Attachments: 7247.v1.patch, 7247.v2.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-04-04 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13622595#comment-13622595
 ] 

Nicolas Liochon commented on HBASE-7247:


I'm doing some tests on trunk on recovery.

Test is:
 - 1000 empty regions on one RS
 - on another computer, all services are running: master, ZK, second RS.
 - Stop (cleanly) the first RS.

Recovery (in this case assignment only) takes 2 minutes.
All the time seems to be spent in writing/reading to ZK.

We have ~3000 calls to ZK#transitionNode, including 1000 for tickleOpening.

Each transition is:
 sync
 readData
 setData


Commenting the sync makes no difference (it could be an effect of the test env).
removing tickleOpening brings the result to 80 seconds (vs. 120s)
Checking but not writing in tickleOpening puts us around 85s. I will do a 
proper patch with this.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.95.1

 Attachments: 7247.v1.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-03-11 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13598749#comment-13598749
 ] 

nkeywal commented on HBASE-7247:


TimeOutManagement it now optional and deactivated by default. I will redo the 
measures.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0

 Attachments: 7247.v1.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-02-20 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582381#comment-13582381
 ] 

nkeywal commented on HBASE-7247:


I'm currently looking as this, now that the blocking issues for this has been 
fixed. Doable for 0.96 imho.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 7247.v1.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13530687#comment-13530687
 ] 

Hadoop QA commented on HBASE-7247:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12556092/7247.v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
104 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 23 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestMultiParallel

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3508//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3508//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3508//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3508//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3508//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3508//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3508//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3508//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3508//console

This message is automatically generated.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 7247.v1.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-10 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13527893#comment-13527893
 ] 

nkeywal commented on HBASE-7247:


From the comments on the mailing list and here:
- it makes sense
- Jimmy Xiang will list the exceptional case to make sure they are covered.

I will try to have things done in two or three steps:
- remove the code from the master
- remove the code from the region server

I will use the review board for this, in a distinct jira...

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 7247.v1.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510363#comment-13510363
 ] 

nkeywal commented on HBASE-7247:


I will fix it with the other stuff so. Currently my patch does not work 
locally, but I'm making progress.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510389#comment-13510389
 ] 

nkeywal commented on HBASE-7247:


The issue comes from mixing asynchronous writes with synchronous writes  
reads. My patch is now working but adds some complexity because of the added 
resynchronisation. I'm gonna do simpler. The ideal solution would be to be full 
async for the region opening (and I'm not gonna do that here :-) )

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510452#comment-13510452
 ] 

nkeywal commented on HBASE-7247:


The patch is getting thinner and thinner...
Now I just check that the znode was not updated recently before writing it 
again. On the test, it brings half of was I have it I remove totally 
tickleOpening.

There is something there as well, in  OpenRegionHandler#process()

{code}
  boolean failed = true;
  if (tickleOpening(post_region_open)) {
if (updateMeta(region)) {
  failed = false;
}
  }
  if (failed || this.server.isStopped() ||
{code}

This 'tickleOpening' is called whatever the time spent previously, before 
updating meta. If we remove it completely, we save a sync.


 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 7247.v1.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510622#comment-13510622
 ] 

stack commented on HBASE-7247:
--

Your approach sounds good nkeywal.  No harm writing less but adding the read in 
case we indeed have lost ownership.  It might mean it takes a bit longer to 
realize we've lost ownership but that should be fine.

On the 'post_region_open', IIRC, a bunch of these tickleOpenings were added 
because we saw issues... in this case, an update of .META. that went in though 
we'd lost ownership of the region.

Stepping back (after looking at code), could we drop the notion that a master 
can intercede and assign a region elsewhere because it is proceeding too slow 
on a particular region in the name of simplifying the region open handling 
interaction?  There would be less noise in the logs and less states to deal 
with.

If we did this, I'd think that we'd want the regionserver to do the initial 
move of the znode from OPENING to OPENING to establish ownership (elsewhere I 
have petitioned that the regionserver should set the OPENING state, and not the 
master -- master should set the state to PENDING_OPEN in the znode rather than 
just in master memory -- as a means of cleanly denoting the regionservers' 
assumption of region ownership).  Then the next transition would be from 
OPENING to OPEN or to FAILED_OPEN.  And that would be it.  Master would just 
presume that the only reason to intercede is when the regionserver loses its 
lease in zk.  We'd drop tickling OPENING so master knows we are progressing on 
a region open -- it would just presume regionserver is making progress and that 
it will kill itself if it can't get to HDFS, etc.  We'd also be dropping 
regionserver-side checks that it still owns a region just before it goes to 
update meta (Could change the meta operation to be a check and put so we didn't 
have to go to zk just before meta edit -- the bit of code you quote above 
nkeywal?).

Just putting it out there.

On patch:


Do we fail the open if the following break happens?

{code}
 tickleOpening = tickleOpening(post_open_deploy);
+if (!tickleOpening) {
+  break;
+}
{code}

We have to call the +zkw.sync(node); ?  We always did that?  We are doing 
the sync just to read the old znode value?  Do we have to?  Could we operate w/ 
stale read?

Are you removing this:

-  HRegion region = this.rsServices.getFromOnlineRegions(encodedName);

If so, why or have you moved the check later?


 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 7247.v1.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510697#comment-13510697
 ] 

nkeywal commented on HBASE-7247:


bq. On the 'post_region_open', IIRC, a bunch of these tickleOpenings were added 
because we saw issues... in this case, an update of .META. that went in though 
we'd lost ownership of the region.
Implicitly, it means we still have a race condition here, just that the 
probability is quite low.

bq. Stepping back (after looking at code), could we drop the notion that a 
master can intercede and assign a region elsewhere because it is proceeding too 
slow on a particular region in the name of simplifying the region open handling 
interaction? There would be less noise in the logs and less states to deal with.
It would be a huge simplification imho. It's worth trying, I would say. It 
actually makes sense to do it now, because once the current trunk code will be 
production proven, touching it will be scarier.

bq. Do we fail the open if the following break happens?
Yes, because updateMeta will return false.

bq. We have to call the + zkw.sync(node); ? We always did that? We are doing 
the sync just to read the old znode value? Do we have to? Could we operate w/ 
stale read?
Well, we want to be sure that no one else wrote anything. It's often overkill, 
because we're going to write the znode immediately after, so the sync will 
occur anyway during the write, as we check the versions during the write. And 
it's expensive. So there is likely some room for improvement here as well 
actually.


bq. HRegion region = this.rsServices.getFromOnlineRegions(encodedName);
We don't do anything here actually. We do a get, if the region is in the list, 
'region' will not be null and that's it. This variable is set later but not 
read in between. I can raise an error if it's in the list? Hopefully it won't 
break anything (famous last words)



 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 7247.v1.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510712#comment-13510712
 ] 

stack commented on HBASE-7247:
--

bq. Implicitly, it means we still have a race condition here, just that the 
probability is quite low.

Yeah.  By-product of our keeping state across multiple systems (up in zk and 
then some state in .meta.).  We could change this to a checkAndPut.  Read 
.META. at start of the opening or have master pass over the .META. timestamp or 
something key to .META. and we'd use it doing checkAndSet into .META. table... 
would be more strict than this updating zk.

bq. It would be a huge simplification imho. It's worth trying, I would say. It 
actually makes sense to do it now, because once the current trunk code will be 
production proven, touching it will be scarier.

I'd go along.  We should discuss out on dev first.  I have short-term memory.  
I'm currently of the opinion that this expensive facility of master failing an 
open because it has been taking too long on a particular regionserver has been 
of no use -- worse, it has only caused headache -- but I may be just not 
remembering and others out on dev list will have better recall than I.



 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 7247.v1.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-05 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13511034#comment-13511034
 ] 

Andrew Purtell commented on HBASE-7247:
---

{quote}
bq. Stepping back (after looking at code), could we drop the notion that a 
master can intercede and assign a region elsewhere because it is proceeding too 
slow on a particular region in the name of simplifying the region open handling 
interaction? There would be less noise in the logs and less states to deal with.
It would be a huge simplification imho. It's worth trying, I would say. It 
actually makes sense to do it now, because once the current trunk code will be 
production proven, touching it will be scarier.
{quote}

+1


 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 7247.v1.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-04 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509789#comment-13509789
 ] 

nkeywal commented on HBASE-7247:


bq. Do we not have this currently? If someone else changes the znode, we don't 
notice? We only notice when we go to update the znode?


I haven't found where where we do it. The result of tickleOpening is actually 
often ignored as well, or at least ignored for a long time. 
for example, in OpenRegionHandler#updateMeta(final HRegion r), we have 

{code}
tickleOpening = true(
while (/* condition, but not on tickleOpening */){
  // do something
  tickleOpening = tickleOpening();
}

return something  tickleOpening;
{code}

In theory, we should break the loop when tickleOpening becomes false.


While the way it's written, it seems that we can have a failure once, then a 
success.
Basically, it s eems that tickleOpening is not always used as a check.

There is about the same code in OpenRegionHandler#process (i.e. an error in 
tickleOpening does not interrupt the process). 

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-04 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509811#comment-13509811
 ] 

ramkrishna.s.vasudevan commented on HBASE-7247:
---

Yes N.. You are right ...

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-04 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509968#comment-13509968
 ] 

nkeywal commented on HBASE-7247:


I'm a little bit scared with the watchers: I would not like to have race 
conditions (if the region is opened twice for example). I'm trying something 
with only the async part... 

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510135#comment-13510135
 ] 

stack commented on HBASE-7247:
--

bq. In theory, we should break the loop when tickleOpening becomes false.

Yes.  That is a bug.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-03 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13508916#comment-13508916
 ] 

nkeywal commented on HBASE-7247:


An possible solution could be:
- use a watcher to see if someone else is updating the znode
- use an asynchronous write to update the znode when we want to say to the 
master we're still there.

It mostly as of today, except that we suppressed the synchronous write on the 
regionserver path.
Pros: should be a small enough change.
Cons: we're still loading ZK (and the master as a side effect).

What do you think?

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-12-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13508939#comment-13508939
 ] 

stack commented on HBASE-7247:
--

bq. use a watcher to see if someone else is updating the znode

Do we not have this currently?  If someone else changes the znode, we don't 
notice?  We only notice when we go to update the znode?

Should we list how many znode updates happen for a region open?

Async'ing the OPENINGs so master gets a tickle that the OPEN is progressing 
sounds fine... We don't actually move any files around on OPEN so its not a 
'problem' if not the owner, really... Its only later after compaction, etc., 
that the damage is done... that is what we should prevent happening for sure.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-11-30 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13507536#comment-13507536
 ] 

stack commented on HBASE-7247:
--

Moving znode from OPENING to OPENING or CLOSING to CLOSING ensuring the 
sequence numbers are as we expect is how we check we still have 'ownership' of 
the region before we go about making alterations in the filesystem.  The 
callback the master gets when we 'update' the znode is used to flag the master 
that the region server is still 'alive' and working on the opening so the 
master updates its opening timer when it gets the callback.

It is imperfect in that we could lose the lease between the check and the fs 
operation.

Alternatives would be to remove this mechanism and instead just rely on our 
getting a callback if the znode is taken from us?  This would happen in another 
thread.  Would have to be watching.  This would seem to coarser than what we 
currently have widening the window during which we could do fs operations on a 
region though we've lost ownership.  Master would not get notification that a 
region server is opening a region only its taking longer than usual.

Any other suggestions?

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-11-30 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13507570#comment-13507570
 ] 

nkeywal commented on HBASE-7247:


For the regionserver side, the callback won't be very difficult to do I think: 
the callback would write a variable that would be tested by tickleOpening, and 
we're done. And as the test cost is cheap, we can test more often.

For master, hum... there are different options. May be the master could ask to 
the region server if it's ok? This would be done if there's no feedback after a 
while and if the regionserver is still alive from a zookeeper point if view? 
This would allow to test multiple regions simultaneously (all the regions of 
this regionserver).


 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-11-30 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13507598#comment-13507598
 ] 

stack commented on HBASE-7247:
--

bq. ...the callback would write a variable that would be tested by 
tickleOpening, and we're done. And as the test cost is cheap, we can test more 
often.

You mean you'd keep writing the znode, just async?

bq. May be the master could ask to the region server if it's ok? 

So master would ping after every region that is opening?  Would this be 
better/less costly than what we have now?

In past, before we moved assign to all-zk, on the region server heartbeat to 
the master, the RS would include the list of OPENING regions.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-11-30 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13507607#comment-13507607
 ] 

Jimmy Xiang commented on HBASE-7247:


If one region server is opening a lot of regions, we just need one handler to 
tickle the opening.  Master just needs to know the region server is still 
running so that it doesn't time out the assignment.

To me, this is an overkill. We can combine this logic and something else, for 
example, to detect if a regionserver is dead.

We used to have the 'owernership' issue as Stack mentioned.  Now, I think we 
are fine since AM should have a consistent view of region states.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2012-11-30 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13507621#comment-13507621
 ] 

stack commented on HBASE-7247:
--

bq. If one region server is opening a lot of regions, we just need one handler 
to tickle the opening. 

This would be a significant change for I'm-not-sure-what-benefit.  The zk 
transactions are region level/scoped and their handling is done at this level 
in open/close exec handlers.  Making it so RS does tracking and updating state 
for master to read regards CLOSING/OPENING would alter a bunch of code.

I'm all for a reexamination of base operations.  Stuff is this way because we 
would have issues where an open would stall for whatever reason... Master would 
intercept the open, take over the region and give it to someone else to open.  
More often than not, we'd just fail again for same reason on the new location 
but the odd time the new re-attempt would succeed.

We could give up reattempt and just let everything hinge on whether a region 
server has a zk lease or not and let ServerShutdownHandler do it all.  It'd be 
a pretty radical difference.  Simplify code but also, in an odd case, it might 
mean we'd fail recover a region (I don't have stats on this).

bq. We used to have the 'owernership' issue as Stack mentioned. Now, I think we 
are fine since AM should have a consistent view of region states.

This is a similar type of leap-in-the-dark (smile).  I love the notion that AM 
is now rock solid.  It may be given the work expended (it certainly is a 
million times better) but I'd like us to run w/ the new AM a while in a few 
productions before making this ruling.

Again, if we could undo the OPENING/CLOSING, etc., stuff would be 
cleaner/simpler (logs would be way less noisy)



 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical

 The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
 Do this so master doesn't timeout this region-in-transition..
 However, on the usual test, this makes the assignment time of 1500 regions 
 goes from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeper commits to disk all the data update, and this takes 
 time. Using it to provide a keep alive seems overkill. At the very list, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are required (I need to go deeper in 
 the internal, feedback welcome), but it seems very important to optimize 
 this... The trival fix would be to make this optional.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira