[ 
https://issues.apache.org/jira/browse/HBASE-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-10136.
-----------------------------------------
    Resolution: Not A Problem

> the table-lock of TableEventHandler is released too early because 
> reOpenAllRegions() is asynchronous
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10136
>                 URL: https://issues.apache.org/jira/browse/HBASE-10136
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.98.0, 0.96.0, 0.99.0
>            Reporter: Aleksandr Shulman
>            Priority: Major
>              Labels: online_schema_change
>         Attachments: HBASE-10136-trunk.patch, HBASE-10136-v0.patch
>
>
> Expected behavior:
> With the introduction of the table-lock, a user can issue a request for a 
> snapshot of a table while that table is undergoing an online schema change 
> and expect that snapshot request to complete correctly. The same holds 
> if a user issues an online schema change request while a snapshot attempt 
> is ongoing.
> Observed behavior:
> Snapshot attempts time out when there is an ongoing online schema change: 
> the handler has already released the table lock, so nobody holds it while 
> the regions are still being closed and reopened, and the snapshot runs 
> during that window.
> TableEventHandler trace
> {code}
> // 1. client.addColumn() call from client...
> // 2. The operation is now on the master
> 2013-12-12 12:09:57,613 DEBUG [MASTER] lock.ZKInterProcessLockBase: Acquired 
> a lock for /hbase/table-lock/TestTable/write-master:452010000000001
> 2013-12-12 12:09:57,640 INFO  [MASTER] handler.TableEventHandler: Handling 
> table operation C_M_ADD_FAMILY on table TestTable
> 2013-12-12 12:09:57,685 INFO  [MASTER] master.MasterFileSystem: AddColumn. 
> Table = TestTable HCD = {NAME => 'x-1386850197327', DATA_BLOCK_ENCODING => 
> 'NONE',$
> 2013-12-12 12:09:57,693 INFO  [MASTER] handler.TableEventHandler: Bucketing 
> regions by region server...
> ...
> 2013-12-12 12:09:57,771 INFO  [MASTER] handler.TableEventHandler: Completed 
> table operation C_M_ADD_FAMILY on table TestTable
> 2013-12-12 12:09:57,771 DEBUG [MASTER] master.AssignmentManager: Starting 
> unassign of TestTable,,1386849056038.854b280$
> 2013-12-12 12:09:57,772 DEBUG [MASTER] lock.ZKInterProcessLockBase: Released 
> /hbase/table-lock/TestTable/write-master:452010000000001
> // 3. The Table*Handler operation is now complete, and the client is 
> notified with "I'm done!"
> // 4. BulkReopen now starts reopening the regions
> 2013-12-12 12:09:57,772 INFO  [MASTER] master.RegionStates: Transitioned 
> {854b280006aec464083778a5cb5f5456 state=OPEN,$
> {code}
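
The ordering in the trace above can be reproduced in a minimal Java sketch. This is not HBase code: the class `TableLockSketch`, the method `runSchemaChange`, and the event strings are hypothetical names chosen for illustration. A `Semaphore` stands in for the ZooKeeper table lock and a `CompletableFuture` stands in for the asynchronous region reopen. A latch holds the reopen back so that the interleaving is deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;

// Hypothetical illustration only; none of these names exist in HBase.
final class TableLockSketch {

    // Runs one simulated schema change and returns the recorded event order.
    // waitForReopen=false mimics the trace: the lock is released while the
    // asynchronous reopen is still pending.
    static List<String> runSchemaChange(boolean waitForReopen) {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        Semaphore tableLock = new Semaphore(1);               // stand-in for the table lock
        CountDownLatch reopenMayFinish = new CountDownLatch(1);

        tableLock.acquireUninterruptibly();
        events.add("lock acquired");
        events.add("schema updated");

        // Asynchronous reopen: the handler gets control back immediately.
        CompletableFuture<Void> reopen = CompletableFuture.runAsync(() -> {
            try {
                reopenMayFinish.await();                      // held back for determinism
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            events.add("regions reopened");
        });

        if (waitForReopen) {
            // Alternative ordering: block until the reopen is done, then release.
            reopenMayFinish.countDown();
            reopen.join();
        }

        tableLock.release();
        events.add("lock released");

        // Let the reopen finish if it has not already, then collect the events.
        reopenMayFinish.countDown();
        reopen.join();
        return new ArrayList<>(events);
    }
}
```

Calling `TableLockSketch.runSchemaChange(false)` records "lock released" before "regions reopened", which matches the trace. Calling it with `true` records them in the opposite order. Whether blocking inside the handler is acceptable on a real master is a separate design question that this sketch does not address.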



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
