[jira] [Commented] (HBASE-7263) Investigate more fine grained locking for checkAndPut/append/increment

2012-12-27 Thread shen guanpu (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539865#comment-13539865 ]

shen guanpu commented on HBASE-7263:


Hi Gregory Chanan,
Does your test operate on just one rowkey?
How much slower will it be when the puts go to different rowkeys? As you mentioned, you
have to wait for MVCC on other rows.

 Investigate more fine grained locking for checkAndPut/append/increment
 --

 Key: HBASE-7263
 URL: https://issues.apache.org/jira/browse/HBASE-7263
 Project: HBase
  Issue Type: Improvement
  Components: Transactions/MVCC
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor

 HBASE-7051 lists 3 options for fixing an ACID-violation wrt checkAndPut:
 {quote}
 1) Waiting for the MVCC to advance for read/updates: the downside is that you 
 have to wait for updates on other rows.
 2) Have an MVCC per-row (table configuration): this avoids the unnecessary 
 contention of 1)
 3) Transform the read/updates to write-only with rollup on read, e.g. an 
 increment would just have the number of values to increment.
 {quote}
 HBASE-7051 and HBASE-4583 implement option #1.  The downside, as mentioned, 
 is that you have to wait for updates on other rows, since MVCC is per-region.
 Another option occurred to me that I think is worth investigating: rely on a 
 row-level read/write lock rather than MVCC.
 Here is pseudo-code for what exists today for read/updates like checkAndPut:
 {code}
 (1)  Acquire RowLock
 (1a) BeginMVCC + Finish MVCC
 (2)  Begin MVCC
 (3)  Do work
 (4)  Release RowLock
 (5)  Append to WAL
 (6)  Finish MVCC
 {code}
 Write-only operations (e.g. puts) are the same, just without step 1a.
 Now, consider the following instead:
 {code}
 (1)  Acquire RowLock
 (1a) Grab+Release RowWriteLock (instead of BeginMVCC + Finish MVCC)
 (1b) Grab RowReadLock (new step!)
 (2)  Begin MVCC
 (3)  Do work
 (4)  Release RowLock
 (5)  Append to WAL
 (6)  Finish MVCC
 (7)  Release RowReadLock (new step!)
 {code}
 As before, write-only operations are the same, just without step 1a.
 The difference here is that writes grab a row-level read lock and hold it 
 until the MVCC is completed.  The nice property that this gives you is that 
 read/updates can tell when the MVCC is done on a per-row basis, because they 
 can just try to acquire the write-lock which will block until the MVCC is 
 completed for that row in step 7.
 There is overhead for acquiring the read lock that I need to measure, but it 
 should be small, since there will never be any blocking on acquiring the 
 row-level read lock.  This is because the read lock can only block if someone 
 else holds the write lock, but both the write and read lock are only acquired 
 under the row lock.
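 To make the choreography concrete, here is a minimal Java sketch of the idea; it is 
 not the actual HRegion change.  RowRwLockManager and the stub methods (acquireRowLock, 
 releaseRowLock, beginMvcc, finishMvcc, appendToWal, doWork) are hypothetical 
 placeholders, and only the ordering of the lock and MVCC steps is meant to mirror the 
 pseudo-code above:
 {code}
// Hypothetical sketch (not the HBase patch): one ReentrantReadWriteLock per row,
// layered on top of the existing row lock.  Stub methods stand in for HRegion
// internals; only the step ordering matters here.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class RowRwLockManager {
  private final ConcurrentMap<String, ReentrantReadWriteLock> rowRwLocks =
      new ConcurrentHashMap<String, ReentrantReadWriteLock>();

  private ReentrantReadWriteLock rwLockFor(String row) {
    ReentrantReadWriteLock lock = rowRwLocks.get(row);
    if (lock == null) {
      ReentrantReadWriteLock created = new ReentrantReadWriteLock();
      ReentrantReadWriteLock existing = rowRwLocks.putIfAbsent(row, created);
      lock = (existing != null) ? existing : created;
    }
    return lock;
  }

  // Write-only path (e.g. put): no step 1a.
  void writeOnly(String row) {
    acquireRowLock(row);                          // (1)
    ReentrantReadWriteLock rw = rwLockFor(row);
    rw.readLock().lock();                         // (1b) held until MVCC completes
    long mvcc = beginMvcc();                      // (2)
    doWork(row);                                  // (3)
    releaseRowLock(row);                          // (4)
    appendToWal(row);                             // (5)
    finishMvcc(mvcc);                             // (6)
    rw.readLock().unlock();                       // (7)
  }

  // Read/update path (e.g. checkAndPut, increment).
  void readUpdate(String row) {
    acquireRowLock(row);                          // (1)
    ReentrantReadWriteLock rw = rwLockFor(row);
    rw.writeLock().lock();                        // (1a) blocks until every earlier
    rw.writeLock().unlock();                      //      write to this row reaches (7)
    rw.readLock().lock();                         // (1b) never blocks: the write lock
                                                  //      is only taken under the row lock
    long mvcc = beginMvcc();                      // (2)
    doWork(row);                                  // (3)
    releaseRowLock(row);                          // (4)
    appendToWal(row);                             // (5)
    finishMvcc(mvcc);                             // (6)
    rw.readLock().unlock();                       // (7)
  }

  // Placeholder stubs for the real HRegion / MVCC / WAL machinery.
  private void acquireRowLock(String row) { }
  private void releaseRowLock(String row) { }
  private long beginMvcc() { return 0L; }
  private void finishMvcc(long writeNumber) { }
  private void appendToWal(String row) { }
  private void doWork(String row) { }
}
 {code}
 The key property is that a read/update blocks only on earlier writes to the same row 
 (step 1a), never on MVCC completion for other rows.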
 I ran a quick test of this approach over a region (this directly interacts 
 with HRegion, so no client effects):
 - 30 threads
 - 5000 increments per thread
 - 30 columns per increment
 - Each increment uniformly distributed over 500,000 rows
 - 5 trials
 Better-Than-Theoretical-Max: (No locking or MVCC on step 1a): 10362.2 ms
 Today: 13950 ms
 The locking approach: 10877 ms
 So it looks like an improvement, at least wrt increment.  As mentioned, I 
 need to measure the overhead of acquiring the read lock for puts.
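 For reference, the shape of such a micro-benchmark can be reproduced with a plain 
 thread pool.  The sketch below only mirrors the parameters listed above (30 threads, 
 5000 increments per thread, rows drawn uniformly from 500,000) and uses a hypothetical 
 RegionIncrementer hook in place of the direct HRegion increment call, so it is 
 illustrative rather than the harness actually used:
 {code}
// Illustrative harness only; RegionIncrementer is a hypothetical stand-in for
// a 30-column increment applied directly against HRegion (no client involved).
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class IncrementBenchmark {

  interface RegionIncrementer {
    void increment(int row) throws Exception;
  }

  // Runs one trial and returns elapsed milliseconds.
  static long runTrial(final RegionIncrementer incrementer) throws Exception {
    final int threads = 30;
    final int incrementsPerThread = 5000;
    final int rows = 500000;

    ExecutorService pool = Executors.newFixedThreadPool(threads);
    long start = System.currentTimeMillis();
    for (int t = 0; t < threads; t++) {
      pool.submit(new Runnable() {
        public void run() {
          Random rand = new Random();
          for (int i = 0; i < incrementsPerThread; i++) {
            try {
              incrementer.increment(rand.nextInt(rows));  // uniform over 500,000 rows
            } catch (Exception e) {
              throw new RuntimeException(e);
            }
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
    return System.currentTimeMillis() - start;
  }
}
 {code}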

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7263) Investigate more fine grained locking for checkAndPut/append/increment

2012-12-27 Thread shen guanpu (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539873#comment-13539873 ]

shen guanpu commented on HBASE-7263:


{quote}
HBASE-7051 and HBASE-4583 implement option #1. The downside, as mentioned, is
that you have to wait for updates on other rows, since MVCC is per-region.
{quote}
Do you mean it is not per-row, unlike option 2 ("Have an MVCC per-row (table
configuration): this avoids the unnecessary contention of 1)")?



[jira] [Commented] (HBASE-6465) Load balancer repeatedly close and open region in the same regionserver.

2012-07-27 Thread shen guanpu (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423706#comment-13423706 ]

shen guanpu commented on HBASE-6465:


OK, thanks.
I did not quite catch the rule, sorry!


[jira] [Created] (HBASE-6465) Load balancer repeatedly close and open region in the same regionserver.

2012-07-26 Thread shen guanpu (JIRA)
shen guanpu created HBASE-6465:
--

 Summary: Load balancer repeatedly close and open region in the 
same regionserver.
 Key: HBASE-6465
 URL: https://issues.apache.org/jira/browse/HBASE-6465
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.0
Reporter: shen guanpu


Looking through the master and regionserver logs, I find the load balancer repeatedly
closing and opening a region on the same regionserver (within each
hbase.balancer.period).
Is this a bug in the load balancer, and how can I dig into or avoid it?


The HBase and Hadoop versions are: HBase Version 0.94.0, r1332822; Hadoop Version
0.20.2-cdh3u1, rbdafb1dbffd0d5f2fbc6ee022e1c8df6500fd638.
The following is a detailed log for the same region,
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956,
and it repeats again and again:
2012-07-16 00:12:49,843 INFO org.apache.hadoop.hbase.master.HMaster: balance
hri=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.,
src=192.168.1.2,60020,1342017399608, dest=192.168.1.2,60020,1342002082592
2012-07-16 00:12:49,843 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of
region
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
(offlining)
2012-07-16 00:12:49,843 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
master:6-0x4384d0a47f40068 Creating unassigned node for
93caf5147d40f5dd4625e160e1b7e956 in a CLOSING state
2012-07-16 00:12:49,845 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to
192.168.1.2,60020,1342017399608 for region
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
2012-07-16 00:12:50,555 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=RS_ZK_REGION_CLOSED, server=192.168.1.2,60020,1342017399608,
region=93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,555 DEBUG
org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED
event for 93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,555 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE;
was=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
state=CLOSED, ts=1342368770556, server=192.168.1.2,60020,1342017399608
2012-07-16 00:12:50,555 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
master:6-0x4384d0a47f40068 Creating (or updating) unassigned node for
93caf5147d40f5dd4625e160e1b7e956 with OFFLINE state
2012-07-16 00:12:50,558 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=M_ZK_REGION_OFFLINE, server=10.75.18.34,6,1342017369575,
region=93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,558 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan
for
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
destination server is 192.168.1.2,60020,1342002082592
2012-07-16 00:12:50,558 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan
for region
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.;
plan=hri=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.,
src=192.168.1.2,60020,1342017399608, dest=192.168.1.2,60020,1342002082592
2012-07-16 00:12:50,558 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Assigning region
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
to 192.168.1.2,60020,1342002082592
2012-07-16 00:12:50,574 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=RS_ZK_REGION_OPENING, server=192.168.1.2,60020,1342017399608,
region=93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,635 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=RS_ZK_REGION_OPENING, server=192.168.1.2,60020,1342017399608,
region=93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,639 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=RS_ZK_REGION_OPENED, server=192.168.1.2,60020,1342017399608,
region=93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,639 DEBUG
org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED
event for
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
from 192.168.1.2,60020,1342017399608; deleting unassigned node
2012-07-16 00:12:50,640 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
master:6-0x4384d0a47f40068 Deleting existing unassigned node for
93caf5147d40f5dd4625e160e1b7e956 that is in expected state
RS_ZK_REGION_OPENED
2012-07-16 00:12:50,641 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: The znode of region
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
has been deleted.
2012-07-16 00:12:50,641 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
master:6-0x4384d0a47f40068 Successfully deleted