[ https://issues.apache.org/jira/browse/HBASE-11536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066137#comment-14066137 ]
Liu Shaohui commented on HBASE-11536:
-------------------------------------

[~stack] [~jxiang]

{quote}
On region open, we pass a sequenceid (the zkid). In 0.94 it is this: "public RegionOpeningState openRegion(HRegionInfo region, int versionOfOfflineNode)". Use this as timestamp?
{quote}
Good suggestion. I will revisit the code around versionOfOfflineNode in region assignment and check whether there are other potential risks.

{quote}
Will using the current timestamp in the client as the version in creating the Put object help, instead of using the default LATEST_TIMESTAMP, assuming the time-skew among these servers is small, and re-assignment takes a little time?
{quote}
I don't think we can depend on the time skew among these servers being small.

> Puts of region location to Meta may be out of order which causes inconsistent of region location
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-11536
>                 URL: https://issues.apache.org/jira/browse/HBASE-11536
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Liu Shaohui
>            Priority: Critical
>         Attachments: 10.237.12.13.log, 10.237.12.15.log
>
>
> In a production HBase cluster, we found an inconsistency in the region location in the meta table. Region cdfa2ed711bbdf054d9733a92fd43eb5 is online on regionserver 10.237.12.13:11600, but the region location in the meta table is 10.237.12.15:11600.
> This is caused by out-of-order puts to the meta table.
> # HMaster tries to assign the region to 10.237.12.15:11600.
> # RegionServer 10.237.12.15:11600: while opening the region, the put of the region location (10.237.12.15:11600) to the meta table times out (60s) and the HTable retries a second time. (The regionserver serving meta has already received the put request. The timeout occurs because there is a bad disk in this regionserver and the sync of the hlog is very slow.)
> During the retry in the HTable, the OpenRegionHandler times out (100s) and the PostOpenDeployTasksThread is interrupted. Though the HTable is finally closed in the MetaEditor, the shared connection the HTable used is not closed, and the put call to the meta table is still in flight on that connection. Call this in-flight put to meta call A.
> # RegionServer 10.237.12.15:11600: because the OpenRegionHandler timed out, it marks the assignment state of this region as FAILED_OPEN.
> # HMaster watches this FAILED_OPEN event and assigns the region to another regionserver: 10.237.12.13:11600.
> # RegionServer 10.237.12.13:11600: this regionserver opens the region successfully. Call the put of the region location (10.237.12.13:11600) to the meta table from this regionserver call B.
> There is no ordering guarantee between calls A and B. If call A is processed after call B in the regionserver serving the meta region, the region location in the meta table will be wrong.
> From the raw scan of the meta table we found:
> {code}
> scan '.META.', {RAW => true, LIMIT => 1, VERSIONS => 10, STARTROW => 'xxx.adfa2ed711bbdf054d9733a92fd43eb5.'}
> {code}
> {quote}
> xxx.adfa2ed711bbdf054d9733a92fd43eb5. column=info:server, timestamp=1404885460553(=> Wed Jul 09 13:57:40 +0800 2014), value=10.237.12.15:11600 --> Retry put from 10.237.12.15
> xxx.adfa2ed711bbdf054d9733a92fd43eb5. column=info:server, timestamp=1404885456731(=> Wed Jul 09 13:57:36 +0800 2014), value=10.237.12.13:11600 --> put from 10.237.12.13
>
> xxx.adfa2ed711bbdf054d9733a92fd43eb5. column=info:server, timestamp=1404885353122(=> Wed Jul 09 13:55:53 +0800 2014), value=10.237.12.15:11600 --> First put from 10.237.12.15
> {quote}
> The related HBase logs are attached to this issue, and discussion is welcome.
> Since there is no ordering guarantee for puts from different HTables, one solution for this issue is to assign an increasing id to each assignment of a region and use this id as the timestamp of the put of the region location to the meta table. HBase clients will then read the region location with the largest assignment id.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
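
The fix proposed in the issue description can be illustrated with a minimal sketch. This is a toy in-memory model of a single versioned cell, not HBase code: `VersionedCell`, `withArrivalTimestamps`, `withAssignIdTimestamps`, and the assignment ids 1 and 2 are all hypothetical names and values chosen for illustration. The only behaviour assumed from HBase is that a normal read returns the version with the highest timestamp.

```java
import java.util.TreeMap;

class MetaPutOrdering {

    /** Toy model of one versioned cell: a read returns the value with the highest timestamp. */
    static final class VersionedCell {
        private final TreeMap<Long, String> versions = new TreeMap<>();

        void put(long timestamp, String value) {
            versions.put(timestamp, value);
        }

        String latest() {
            return versions.isEmpty() ? null : versions.lastEntry().getValue();
        }
    }

    /** Timestamps are assigned in arrival order, so the late retry (call A) wins. */
    static String withArrivalTimestamps() {
        VersionedCell infoServer = new VersionedCell();
        long arrivalClock = 0; // stands in for the server-side clock
        infoServer.put(++arrivalClock, "10.237.12.15:11600"); // first put from 10.237.12.15
        infoServer.put(++arrivalClock, "10.237.12.13:11600"); // call B from 10.237.12.13
        infoServer.put(++arrivalClock, "10.237.12.15:11600"); // call A, the delayed retry
        return infoServer.latest();
    }

    /** Timestamps are the (hypothetical) assignment ids, so arrival order no longer matters. */
    static String withAssignIdTimestamps() {
        VersionedCell infoServer = new VersionedCell();
        long firstAssignId = 1L;  // assignment to 10.237.12.15
        long secondAssignId = 2L; // re-assignment to 10.237.12.13
        infoServer.put(firstAssignId, "10.237.12.15:11600");  // first put
        infoServer.put(secondAssignId, "10.237.12.13:11600"); // call B
        infoServer.put(firstAssignId, "10.237.12.15:11600");  // call A reuses the older id
        return infoServer.latest();
    }

    public static void main(String[] args) {
        System.out.println("arrival-order timestamps: " + withArrivalTimestamps());  // 10.237.12.15:11600
        System.out.println("assign-id timestamps:     " + withAssignIdTimestamps()); // 10.237.12.13:11600
    }
}
```

In the first method the stale location ends up on top, matching the raw scan in the description; in the second, the retry is written under the older id and cannot shadow the newer assignment. How the id would be generated and passed to the regionserver (for example via versionOfOfflineNode, as suggested in the comment) is left open here.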