[
https://issues.apache.org/jira/browse/HBASE-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552552#comment-13552552
]
nkeywal commented on HBASE-7551:
--------------------------------
Nice catch.
bq. We should ensure that the transition from SPLITTING to SPLITTING should
happen only after the master has set the watch on the znode and we should be
sure of that.
That would add a dependency on the master, but maybe there is no other solution.
Something that would work, if we accept this, is:
- the region server makes an RPC call to the master asking 'may I start a split'
- when the master is ok, it creates the znode M_SPLITTING_REQUEST and watches it
- the region server can then start the split.
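
The three steps above can be sketched roughly as follows. This is an in-memory
model only, not HBase or ZooKeeper code: the classes `Master` and
`RegionServer`, the method `maySplit`, and the map standing in for
/hbase/unassigned are all hypothetical names chosen for illustration, and the
sketch assumes RS_ZK_REGION_SPLIT as the final state. The point it tries to
show is the ordering: the node exists and is watched before the region server
makes its first transition.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical sketch of the 'may I start a split' handshake; not HBase code. */
public class SplitHandshakeSketch {

    /** Stand-in for the master. The map models the znodes under /hbase/unassigned. */
    static class Master {
        final Map<String, String> znodes = new ConcurrentHashMap<>();
        final List<String> observed = new CopyOnWriteArrayList<>();

        /**
         * Hypothetical RPC endpoint. The master creates the node itself, so in this
         * model it is watching the node before the region server is told to proceed.
         * Returns false if a node for the region already exists.
         */
        boolean maySplit(String region) {
            return znodes.putIfAbsent(region, "M_SPLITTING_REQUEST") == null;
        }

        /** Models the watch callback fired on every data change of a watched node. */
        void nodeDataChanged(String region, String state) {
            observed.add(region + ":" + state);
        }
    }

    /** Stand-in for the region server side of the split. */
    static class RegionServer {
        private final Master master;

        RegionServer(Master master) {
            this.master = master;
        }

        boolean split(String region) {
            // Steps 1 and 2: ask the master; it creates and watches the node.
            if (!master.maySplit(region)) {
                return false;
            }
            // Step 3: only now start the split; every transition is observed.
            transition(region, "RS_ZK_REGION_SPLITTING");
            transition(region, "RS_ZK_REGION_SPLIT");
            return true;
        }

        private void transition(String region, String state) {
            master.znodes.put(region, state);
            master.nodeDataChanged(region, state);
        }
    }

    public static void main(String[] args) {
        Master master = new Master();
        RegionServer rs = new RegionServer(master);
        rs.split("a9e57d09c58b3ef3b949d602232fb2c2");
        System.out.println(master.observed);
    }
}
```
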
The advantage is that it's the same logic as an assignment: the master creates
the first node. But it creates a dependency on the master. The other way would
be to change the master so that it can handle the case where it receives only
the last event. It's as if the master was dead when the split took place: the
master receives only the result...
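
A minimal sketch of that second option, under stated assumptions: the class
`MasterView`, its `State` enum, and the handler names are hypothetical and do
not correspond to the real AssignmentManager API. The idea it illustrates is
that the handler for the final event does not require the intermediate
SPLITTING event to have been seen, so a missed event leads to the same end
state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a master that tolerates seeing only the last split event. */
public class LastEventOnlySketch {

    enum State { OPEN, SPLITTING, SPLIT }

    /** The master's in-memory view of region states (illustrative only). */
    static class MasterView {
        final Map<String, State> regions = new HashMap<>();

        /** Intermediate event; may never arrive if the watch was set too late. */
        void onSplitting(String parent) {
            regions.put(parent, State.SPLITTING);
        }

        /**
         * Final event. Deliberately does not check that the parent was in
         * SPLITTING, so it behaves the same whether or not onSplitting ran.
         */
        void onSplitDone(String parent, String daughterA, String daughterB) {
            regions.put(parent, State.SPLIT);
            regions.put(daughterA, State.OPEN);
            regions.put(daughterB, State.OPEN);
        }
    }

    public static void main(String[] args) {
        MasterView missed = new MasterView();
        missed.regions.put("parent", State.OPEN);
        // The SPLITTING event was missed; only the result is delivered.
        missed.onSplitDone("parent", "daughterA", "daughterB");
        System.out.println(missed.regions);
    }
}
```
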
> nodeChildrenChange event may happen after the transition to
> RS_ZK_REGION_SPLITTING in SplitTransaction causing the SPLIT event to be
> missed in the master side.
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-7551
> URL: https://issues.apache.org/jira/browse/HBASE-7551
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.4
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 0.96.0, 0.94.5
>
>
> This came from HBASE-7468.
> I found the issue and am able to reproduce it.
> See the logs:
> {code}
> 2013-01-14 14:37:21,760 INFO [main] regionserver.SplitTransaction(216):
> Starting split of region
> testShouldClearRITWhenNodeFoundInSplittingState,,1358154439514.a9e57d09c58b3ef3b949d602232fb2c2.
> 2013-01-14 14:37:21,760 DEBUG [main] regionserver.SplitTransaction(871):
> regionserver:61665-0x13c384e4e4f0002 Creating ephemeral node for
> a9e57d09c58b3ef3b949d602232fb2c2 in SPLITTING state
> 2013-01-14 14:37:21,844 DEBUG [main] zookeeper.ZKAssign(757):
> regionserver:61665-0x13c384e4e4f0002 Attempting to transition node
> a9e57d09c58b3ef3b949d602232fb2c2 from RS_ZK_REGION_SPLITTING to
> RS_ZK_REGION_SPLITTING
> 2013-01-14 14:37:21,849 DEBUG [Thread-873-EventThread]
> zookeeper.ZooKeeperWatcher(277): master:62334-0x13c384e4e4f001b Received
> ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected,
> path=/hbase/unassigned
> 2013-01-14 14:37:21,853 DEBUG [main] zookeeper.ZKUtil(1565):
> regionserver:61665-0x13c384e4e4f0002 Retrieved 140 byte(s) of data from znode
> /hbase/unassigned/a9e57d09c58b3ef3b949d602232fb2c2;
> data=region=testShouldClearRITWhenNodeFoundInSplittingState,,1358154439514.a9e57d09c58b3ef3b949d602232fb2c2.,
> origin=Ram.Home,61665,1358154325430, state=RS_ZK_REGION_SPLITTING
> 2013-01-14 14:37:21,918 DEBUG [main] zookeeper.ZKAssign(820):
> regionserver:61665-0x13c384e4e4f0002 Successfully transitioned node
> a9e57d09c58b3ef3b949d602232fb2c2 from RS_ZK_REGION_SPLITTING to
> RS_ZK_REGION_SPLITTING
> 2013-01-14 14:37:21,919 DEBUG [Thread-873-EventThread] zookeeper.ZKUtil(417):
> master:62334-0x13c384e4e4f001b Set watcher on existing znode
> /hbase/unassigned/a9e57d09c58b3ef3b949d602232fb2c2
> {code}
> Here we can observe that the SPLITTING node was created first. We then
> transition it from SPLITTING to SPLITTING so that the AM can get the
> nodeDataChange event. But for the nodeDataChange event to be delivered, the
> nodeChildrenChange event must happen first so that the master can set a
> watcher on the node.
> When this hang happens, we can see that the watcher is set by the
> nodeChildrenChange event only after the transition has happened, so the
> SPLITTING to SPLITTING event itself is missed or skipped.
> Ideally the nodeChildrenChange event iterates through the list of new znodes
> under /hbase/unassigned and then sets a watcher on each of them. One reason
> for the delay could be that there is more than one znode, so the watch
> setting operation takes time. The order of execution is different when we
> run from Eclipse and when we run the mvn tests.
> My conclusion is that the testcase actually reveals the problem, but the same
> can happen in any case where the SPLITTING event can be missed. Maybe
> some of the SPLIT-related bugs that were raised are due to this? Need to
> analyse.
> Any suggestions welcome. We should ensure that the transition from SPLITTING
> to SPLITTING should happen only after the master has set the watch on the
> znode and we should be sure of that.