keith-turner commented on PR #2792:
URL: https://github.com/apache/accumulo/pull/2792#issuecomment-1249134891

    >  It feels like we're shooting in the dark, adding a sanity check against 
a problem that is not well-defined.
   
   I think the validation outlined in #1377 is very well defined: If table ids 
w/ a certain state were seen in ZK before and after metadata scan then we must  
see them in metadata OR something is really really wrong.   The less the GC 
trust that the persisted data it reads is correct and the more it tries to 
verify what it can about that data the more robust GC will be.   The GC already 
does a good bit of checks on the metadata table as it reads it.  Checking the 
metadata table against ZK will make the verification more thorough.
   
   In addition to the check outlined in #1377, another check was added in #2293 
for the case where table ids were seen in metadata table and none were seen in 
ZK.   This situation can legitimately happen when the total number of tables 
goes to/from zero concurrently with a GC cycle.  This situation could also 
happen when ZK read is silently failing and returning no data.  There is no way 
to distinguish between the two causes.  If the system is actually working 
correctly, then the legitimate case would be transient.   We could defer this 
second check and open an issue to do it another PR where we try to figure out 
how to avoid the false positive or make the check less noisy at first (like 
always skip GC cycle but log info at first and warn/error on subsequent cycles 
if seen again).  IMO it would be good to leave this second check as it is in 
the current PR because it can always be removed in bug fix release.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to