thangTang commented on issue #1071: HBASE-23693 Split failure may cause region 
hole and data loss when use zk assign
URL: https://github.com/apache/hbase/pull/1071#issuecomment-576352379
 
 
   > This seems to solve the main issue, which is the parent and its interim 
daughters being wiped off. However, I'm wondering if the problem happens because 
we are setting parent state offline and split=true too soon in the split 
operation. What would you think, @thangTang ?
   > 
   > Also, is it possible to provide a UT to reproduce? Might be too complex, 
though, since it involves some sort of race condition.
   
   I think the direct cause of this problem is that the meta table is not 
updated after the daughter regions are cleaned up. At that point the daughter 
regions no longer exist and the master will try to reassign the parent region 
anyway, so updating the parent's entry in the meta table should have no 
negative impact. As for whether to change when the parent region's state is 
set in the meta table, we may need to review the split process as a whole. 
After all, the split transaction is complex when ZK assign is used.
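   
   To illustrate the idea, here is a minimal in-memory sketch. It is a toy 
model, not HBase code: the class, the methods, and the way the meta table is 
represented are all hypothetical, and only the offline and split flags come 
from the discussion above. It shows why the parent can only be reassigned if 
the cleanup also resets its entry.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the failure; NOT HBase code. All names here are hypothetical. */
public class SplitCleanupSketch {

    /** Simplified stand-in for the parent's state flags in the meta table. */
    static final class MetaEntry {
        boolean offline;
        boolean split;
    }

    /** Region name -> simplified meta entry. */
    final Map<String, MetaEntry> meta = new HashMap<>();

    /** Interim daughter regions that currently exist. */
    final Set<String> daughters = new HashSet<>();

    /** The split marks the parent offline with split=true and creates the daughters. */
    void markSplit(String parent, String daughterA, String daughterB) {
        MetaEntry entry = meta.computeIfAbsent(parent, k -> new MetaEntry());
        entry.offline = true;
        entry.split = true;
        daughters.add(daughterA);
        daughters.add(daughterB);
    }

    /**
     * Cleanup after a failed split. When resetParent is false, only the
     * daughters are removed and the parent's entry stays offline/split, which
     * models the region hole. When true, the parent's entry is reset as well.
     */
    void cleanUpFailedSplit(String parent, String daughterA, String daughterB,
                            boolean resetParent) {
        daughters.remove(daughterA);
        daughters.remove(daughterB);
        MetaEntry entry = meta.get(parent);
        if (resetParent && entry != null) {
            entry.offline = false;
            entry.split = false;
        }
    }

    /** In this model a region can be assigned only if it is neither offline nor split. */
    boolean isAssignable(String region) {
        MetaEntry entry = meta.get(region);
        return entry != null && !entry.offline && !entry.split;
    }

    public static void main(String[] args) {
        SplitCleanupSketch stale = new SplitCleanupSketch();
        stale.markSplit("parent", "daughterA", "daughterB");
        stale.cleanUpFailedSplit("parent", "daughterA", "daughterB", false);
        System.out.println("without meta update: " + stale.isAssignable("parent"));

        SplitCleanupSketch fixed = new SplitCleanupSketch();
        fixed.markSplit("parent", "daughterA", "daughterB");
        fixed.cleanUpFailedSplit("parent", "daughterA", "daughterB", true);
        System.out.println("with meta update: " + fixed.isAssignable("parent"));
    }
}
```

   In the first case neither the parent nor its daughters can serve the key 
range; in the second the parent becomes assignable again.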
   
   I've considered unit testing, but this case is difficult to mock. I hit the 
problem in an extreme scenario: an RS crashes during a split, after the PONR 
but before the split fully completes, no other RS is available, and a master 
switch occurs.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
