keith-turner commented on PR #2792:
URL: https://github.com/apache/accumulo/pull/2792#issuecomment-1249134891
> It feels like we're shooting in the dark, adding a sanity check against
a problem that is not well-defined.
I think the validation outlined in #1377 is very well defined: If table ids
w/ a certain state were seen in ZK before and after metadata scan then we must
see them in metadata OR something is really really wrong. The less the GC
trust that the persisted data it reads is correct and the more it tries to
verify what it can about that data the more robust GC will be. The GC already
does a good bit of checks on the metadata table as it reads it. Checking the
metadata table against ZK will make the verification more thorough.
In addition to the check outlined in #1377, another check was added in #2293
for the case where table ids were seen in metadata table and none were seen in
ZK. This situation can legitimately happen when the total number of tables
goes to/from zero concurrently with a GC cycle. This situation could also
happen when ZK read is silently failing and returning no data. There is no way
to distinguish between the two causes. If the system is actually working
correctly, then the legitimate case would be transient. We could defer this
second check and open an issue to do it another PR where we try to figure out
how to avoid the false positive or make the check less noisy at first (like
always skip GC cycle but log info at first and warn/error on subsequent cycles
if seen again). IMO it would be good to leave this second check as it is in
the current PR because it can always be removed in bug fix release.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]