ctubbsii commented on code in PR #2792:
URL: https://github.com/apache/accumulo/pull/2792#discussion_r977751451
##########
server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectionAlgorithm.java:
##########
@@ -216,6 +224,46 @@ private long removeBlipCandidates(GarbageCollectionEnvironment gce,
return blipCount;
}
+ @VisibleForTesting
+ /**
+ * Double check no tables were missed during GC
+ */
+ protected void ensureAllTablesChecked(Set<TableId> tableIdsBefore, Set<TableId> tableIdsSeen,
+ Set<TableId> tableIdsAfter) {
+
+ // if a table was added or deleted during this run, it is acceptable to not
+ // have seen those tables ids when scanning the metadata table. So get the intersection
+ final Set<TableId> tableIdsMustHaveSeen = new HashSet<>(tableIdsBefore);
+ tableIdsMustHaveSeen.retainAll(tableIdsAfter);
+
+ if (tableIdsMustHaveSeen.isEmpty() && !tableIdsSeen.isEmpty()) {
+ throw new RuntimeException("Garbage collection will not proceed because "
+ + "table ids were seen in the metadata table and none were seen Zookeeper. "
+ + "This can have two causes. First, total number of tables going to/from "
+ + "zero during a GC cycle will cause this. Second, it could be caused by "
+ + "corruption of the metadata table and/or Zookeeper. Only the second cause "
+ + "is problematic, but there is no way to distinguish between the two causes "
+ + "so this GC cycle will not proceed. The first cause should be transient "
+ + "and one would not expect to see this message repeated in subsequent GC cycles.");
+ }
Review Comment:
Creating the first table might not trigger this, since there would be no GC
candidate files yet. Still, users are likely to create tables soon after
initialization, so going from zero (user) tables to non-zero is more likely
to happen then than at any other time. The initial activity would just need
to be more involved to trigger this check, such as creating a test table,
flushing it, deleting it, and then creating the first table that the first GC
run sees.
However, if this doesn't crash the GC and merely causes the current cycle to
be skipped, then my concern goes away.
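
To make the scenario concrete, here is a minimal standalone sketch of the
intersection check quoted above. It is an illustration only, not the PR's
code: it uses String in place of TableId, the class and method names
(TableCheckSketch, shouldSkipCycle) are hypothetical, and it returns a
boolean instead of throwing, on the assumption that the caller would log
and skip the cycle.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the check in ensureAllTablesChecked.
// String replaces TableId so the sketch compiles on its own.
final class TableCheckSketch {

  // Returns true when no table id is present both before and after the scan,
  // yet the metadata scan still saw table ids. This is the condition under
  // which the quoted code throws.
  static boolean shouldSkipCycle(Set<String> before, Set<String> seen, Set<String> after) {
    Set<String> mustHaveSeen = new HashSet<>(before);
    mustHaveSeen.retainAll(after); // ids that existed for the whole cycle
    return mustHaveSeen.isEmpty() && !seen.isEmpty();
  }

  public static void main(String[] args) {
    // Assumed scenario: zero tables before, first table "2" created mid-cycle.
    System.out.println(shouldSkipCycle(Set.of(), Set.of("2"), Set.of("2"))); // prints true

    // Steady state: table "1" exists throughout, "2" is added mid-cycle.
    System.out.println(shouldSkipCycle(Set.of("1"), Set.of("1"), Set.of("1", "2"))); // prints false
  }
}
```

The first call models the zero to non-zero transition I described. If the
caller treats a true result as "skip this cycle" rather than as a fatal
error, the next cycle should pass once the table set is stable.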