[ 
https://issues.apache.org/jira/browse/HBASE-11536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074201#comment-14074201
 ] 

Liu Shaohui commented on HBASE-11536:
-------------------------------------

[~stack]
Agree that the one time it fails, it'd be a high profile situation and we'd fix 
it.

Using versionOfOfflineNode carries a risk: if we migrate an existing hbase 
cluster from one zk cluster to a new one, this method will not work.

After a discussion with [~fenghh], we agree with [~jxiang]'s suggestion: use 
the regionserver timestamp as the version. The default timeout for updating 
meta is 100s, which is far larger than the time skew between regionservers. 
We also have an alert if the time skew between an hbase server and the ntp 
server is larger than 100ms.
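
To illustrate why a timestamp taken on the regionserver helps, here is a 
minimal sketch that replays the three puts from the raw scan in the issue 
description against a toy versioned cell where the newest timestamp wins. The 
class and method names are hypothetical and no HBase API is used. It assumes 
the opening regionserver stamps every retry with the time its open attempt 
started, which may differ from how an actual patch picks the timestamp.

```java
import java.util.TreeMap;

// Toy model of one info:server cell: versions keyed by timestamp, readers see the newest.
// Hypothetical names; this is not HBase code.
final class MetaCellSketch {
    private final TreeMap<Long, String> versions = new TreeMap<>();

    void put(long timestamp, String server) {
        versions.put(timestamp, server);
    }

    String current() {
        return versions.isEmpty() ? null : versions.lastEntry().getValue();
    }

    // Replays the scenario; the timestamps are the ones shown in the raw scan.
    static String replay(boolean stampOnRegionServer) {
        final long openStartOn15 = 1404885353122L; // first put from 10.237.12.15
        final long openOn13 = 1404885456731L;      // put B from 10.237.12.13
        final long retryArrival = 1404885460553L;  // retried put A reaches meta
        MetaCellSketch cell = new MetaCellSketch();
        cell.put(openStartOn15, "10.237.12.15:11600");
        cell.put(openOn13, "10.237.12.13:11600");
        // Assumption: with regionserver stamping, the retry reuses the open-start time.
        cell.put(stampOnRegionServer ? openStartOn15 : retryArrival, "10.237.12.15:11600");
        return cell.current();
    }

    public static void main(String[] args) {
        System.out.println("arrival-time stamping: " + replay(false));
        System.out.println("regionserver stamping: " + replay(true));
    }
}
```

In this replay the two open attempts are about 103 seconds apart, so a clock 
skew in the 100ms range would not change which version is newest.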

In the long term, I think updates to meta should be done by only one process, 
e.g. HMaster, which decides which updates are illegal according to its state 
machine.
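
A rough sketch of that single-writer idea, under the assumption that the 
master remembers the current assignee of each region and treats a location 
report from any other server as illegal. All names here are hypothetical; 
this is not the AssignmentManager API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical single writer for meta: only the current assignee may publish a location.
final class SingleWriterSketch {
    private final Map<String, String> assignee = new HashMap<>(); // region -> server chosen by master
    private final Map<String, String> meta = new HashMap<>();     // region -> published location

    void assign(String region, String server) {
        assignee.put(region, server);
    }

    // Returns true if the update was legal and written, false if it was rejected as stale.
    boolean reportOpened(String region, String server) {
        if (!server.equals(assignee.get(region))) {
            return false;
        }
        meta.put(region, server);
        return true;
    }

    String location(String region) {
        return meta.get(region);
    }

    public static void main(String[] args) {
        SingleWriterSketch master = new SingleWriterSketch();
        String region = "cdfa2ed711bbdf054d9733a92fd43eb5";
        master.assign(region, "10.237.12.15:11600");
        master.assign(region, "10.237.12.13:11600"); // reassigned after FAILED_OPEN
        System.out.println(master.reportOpened(region, "10.237.12.13:11600"));
        System.out.println(master.reportOpened(region, "10.237.12.15:11600")); // late report
        System.out.println(master.location(region));
    }
}
```

Because every write goes through one process, the order in which reports 
arrive no longer matters: a late report from the old assignee is simply 
dropped.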

Another related problem is the META region location (for trunk). It's 
possible that updates of the META region location arrive out of order when 
the opening of the meta region times out.

Looking forward to your suggestions. Thanks [~stack]



> Puts of region location to Meta may be out of order which causes inconsistent 
> of region location
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-11536
>                 URL: https://issues.apache.org/jira/browse/HBASE-11536
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Liu Shaohui
>            Priority: Critical
>         Attachments: 10.237.12.13.log, 10.237.12.15.log, 
> HBASE-11536-0.94-v1.diff
>
>
> In a production hbase cluster, we found an inconsistency of region location 
> in the meta table. Region cdfa2ed711bbdf054d9733a92fd43eb5 is online on 
> regionserver 10.237.12.13:11600 but the region location in the Meta table is 
> 10.237.12.15:11600.
> This is caused by out-of-order puts to the meta table.
> # HMaster tries to assign the region to 10.237.12.15:11600.
> # RegionServer 10.237.12.15:11600: While opening the region, the put of the 
> region location (10.237.12.15:11600) to the meta table times out (60s) and 
> the htable retries a second time. (The regionserver serving meta has 
> received the put request. The timeout occurs because there is a bad disk in 
> this regionserver and the sync of the hlog is very slow.)
> During the retry in htable, the OpenRegionHandler times out (100s) and the 
> PostOpenDeployTasksThread is interrupted. Though the htable is eventually 
> closed in the MetaEditor, the shared connection the htable used is not 
> closed and the put call for the meta table is still in flight on the 
> connection. Call this in-flight put to meta call A.
> # RegionServer 10.237.12.15:11600: Because the OpenRegionHandler timed out, 
> it marks the assign state of this region as FAILED_OPEN.
> # HMaster watches this FAILED_OPEN event and assigns the region to another 
> regionserver: 10.237.12.13:11600
> # RegionServer 10.237.12.13:11600: This regionserver opens the region 
> successfully. Call the put of the region location (10.237.12.13:11600) to 
> the meta table from this regionserver call B.
> There is no ordering guarantee between calls A and B. If call A is processed 
> after call B on the regionserver serving the meta region, the region 
> location in the meta table will be wrong.
> From the raw scan of the meta table we found:
> {code}
> scan '.META.', {RAW => true, LIMIT => 1, VERSIONS => 10, STARTROW => 
> 'xxx.adfa2ed711bbdf054d9733a92fd43eb5.'} 
> {code}
> {quote}
> xxx.adfa2ed711bbdf054d9733a92fd43eb5. column=info:server, 
> timestamp=1404885460553(=> Wed Jul 09 13:57:40 +0800 2014), 
> value=10.237.12.15:11600 --> Retry put from 10.237.12.15
> xxx.adfa2ed711bbdf054d9733a92fd43eb5. column=info:server, 
> timestamp=1404885456731(=> Wed Jul 09 13:57:36 +0800 2014), 
> value=10.237.12.13:11600 --> put from 10.237.12.13
>     
> xxx.adfa2ed711bbdf054d9733a92fd43eb5. column=info:server, 
> timestamp=1404885353122(=> Wed Jul 09 13:55:53 +0800 2014), 
> value=10.237.12.15:11600  --> First put from 10.237.12.15
> {quote}
> Related hbase logs are attached to this issue and discussion is welcome.
> Since there is no ordering guarantee for puts from different htables, one 
> solution for this issue is to give each assignment of a region an increasing 
> id and use this id as the timestamp of the put of the region location to the 
> meta table. HBase clients will then get the region location with the largest 
> assign id.
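
The assign-id idea in the quoted description can be sketched as follows. This 
is a toy model with hypothetical names, not HBase code; it assumes a single 
component hands out the ids and that readers always take the version with 
the largest id.

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

// Toy model: each assignment gets an increasing id, used as the version of the location put.
final class AssignIdSketch {
    private final AtomicLong nextAssignId = new AtomicLong(1);
    private final TreeMap<Long, String> versions = new TreeMap<>();

    // Called once per assignment attempt, before the region is handed to a regionserver.
    long newAssignment() {
        return nextAssignId.getAndIncrement();
    }

    // The put carries the assign id instead of a wall-clock timestamp.
    void putLocation(long assignId, String server) {
        versions.put(assignId, server);
    }

    // Readers see the location written under the largest assign id.
    String current() {
        return versions.isEmpty() ? null : versions.lastEntry().getValue();
    }

    public static void main(String[] args) {
        AssignIdSketch meta = new AssignIdSketch();
        long first = meta.newAssignment();  // assignment to 10.237.12.15
        long second = meta.newAssignment(); // reassignment to 10.237.12.13
        meta.putLocation(second, "10.237.12.13:11600"); // call B arrives first
        meta.putLocation(first, "10.237.12.15:11600");  // delayed call A arrives later
        System.out.println(meta.current());
    }
}
```

Even though call A arrives after call B, it is stored under the smaller id 
and does not hide the newer location.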



--
This message was sent by Atlassian JIRA
(v6.2#6252)
