ctubbsii commented on code in PR #2792:
URL: https://github.com/apache/accumulo/pull/2792#discussion_r975587422
##########
server/gc/src/main/java/org/apache/accumulo/gc/GCRun.java:
##########
@@ -462,4 +507,39 @@ public long getErrorsStat() {
public long getCandidatesStat() {
return candidates;
}
+
+ /**
+ * Return a set of all TableIDs in the
+ * {@link org.apache.accumulo.core.metadata.schema.Ample.DataLevel} for which we are considering
+ * deletes. When operating on DataLevel.USER this will return all user table ids in an ONLINE or
+ * OFFLINE state. When operating on DataLevel.METADATA this will return the table id for the
+ * accumulo.metadata table. When operating on DataLevel.ROOT this will return the table id for the
+ * accumulo.root table.
+ *
+ * @return The table ids
+ * @throws InterruptedException
+ * if interrupted when calling ZooKeeper
+ */
+ @Override
+ public Set<TableId> getCandidateTableIDs() throws InterruptedException {
+ if (level == DataLevel.ROOT) {
+ return Set.of(RootTable.ID);
+ } else if (level == DataLevel.METADATA) {
+ return Collections.singleton(MetadataTable.ID);
+ } else if (level == DataLevel.USER) {
+ Set<TableId> tableIds = new HashSet<>();
+ getTableIDs().forEach((k, v) -> {
+ if (v == TableState.ONLINE || v == TableState.OFFLINE) {
+ // Don't return tables that are NEW, DELETING, or in an
+ // UNKNOWN state.
+ tableIds.add(k);
+ }
+ });
+ tableIds.remove(MetadataTable.ID);
+ tableIds.remove(RootTable.ID);
+ return tableIds;
Review Comment:
Instead of forEach, this should use a predicate on the stream and use a
stream collector. Something like the following, which is shorter and cleaner.
Using forEach to populate a collection constructed prior to the loop is rarely
best practice.
```suggestion
return getTableIDs().entrySet().stream()
.filter(e -> e.getValue() == TableState.ONLINE || e.getValue() == TableState.OFFLINE)
.map(Entry::getKey)
.filter(k -> !k.equals(MetadataTable.ID) && !k.equals(RootTable.ID))
.collect(Collectors.toSet());
```
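(The suggestion assumes `Map.Entry` and `java.util.stream.Collectors` are imported; adjust imports as needed.)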
##########
server/gc/src/main/java/org/apache/accumulo/gc/GCRun.java:
##########
@@ -462,4 +507,39 @@ public long getErrorsStat() {
public long getCandidatesStat() {
return candidates;
}
+
+ /**
+ * Return a set of all TableIDs in the
+ * {@link org.apache.accumulo.core.metadata.schema.Ample.DataLevel} for which we are considering
+ * deletes. When operating on DataLevel.USER this will return all user table ids in an ONLINE or
+ * OFFLINE state. When operating on DataLevel.METADATA this will return the table id for the
+ * accumulo.metadata table. When operating on DataLevel.ROOT this will return the table id for the
+ * accumulo.root table.
+ *
+ * @return The table ids
+ * @throws InterruptedException
+ * if interrupted when calling ZooKeeper
+ */
+ @Override
+ public Set<TableId> getCandidateTableIDs() throws InterruptedException {
+ if (level == DataLevel.ROOT) {
+ return Set.of(RootTable.ID);
+ } else if (level == DataLevel.METADATA) {
+ return Collections.singleton(MetadataTable.ID);
Review Comment:
Why does one return a Set.of(), and the other return a
Collections.singleton()? Seems unnecessarily inconsistent.
```suggestion
return Set.of(RootTable.ID);
} else if (level == DataLevel.METADATA) {
return Set.of(MetadataTable.ID);
```
##########
server/gc/src/test/java/org/apache/accumulo/gc/GarbageCollectionTest.java:
##########
@@ -80,8 +95,10 @@ public Stream<Reference> getReferences() {
}
@Override
- public Set<TableId> getTableIDs() {
- return tableIds;
+ public Map<TableId,TableState> getTableIDs() {
+ HashMap<TableId,TableState> results = new HashMap<>();
+ tableIds.forEach((t) -> results.put(t, TableState.ONLINE));
+ return results;
Review Comment:
Again, forEach to populate a collection constructed prior to the loop is
rarely best:
```suggestion
return tableIds.stream().collect(Collectors.toMap(id -> id, id -> TableState.ONLINE));
```
##########
server/gc/src/main/java/org/apache/accumulo/gc/GCRun.java:
##########
@@ -462,4 +507,39 @@ public long getErrorsStat() {
public long getCandidatesStat() {
return candidates;
}
+
+ /**
+ * Return a set of all TableIDs in the
+ * {@link org.apache.accumulo.core.metadata.schema.Ample.DataLevel} for which we are considering
+ * deletes. When operating on DataLevel.USER this will return all user table ids in an ONLINE or
+ * OFFLINE state. When operating on DataLevel.METADATA this will return the table id for the
+ * accumulo.metadata table. When operating on DataLevel.ROOT this will return the table id for the
+ * accumulo.root table.
+ *
+ * @return The table ids
+ * @throws InterruptedException
+ * if interrupted when calling ZooKeeper
+ */
+ @Override
+ public Set<TableId> getCandidateTableIDs() throws InterruptedException {
+ if (level == DataLevel.ROOT) {
+ return Set.of(RootTable.ID);
+ } else if (level == DataLevel.METADATA) {
+ return Collections.singleton(MetadataTable.ID);
+ } else if (level == DataLevel.USER) {
+ Set<TableId> tableIds = new HashSet<>();
+ getTableIDs().forEach((k, v) -> {
+ if (v == TableState.ONLINE || v == TableState.OFFLINE) {
+ // Don't return tables that are NEW, DELETING, or in an
+ // UNKNOWN state.
+ tableIds.add(k);
+ }
+ });
+ tableIds.remove(MetadataTable.ID);
+ tableIds.remove(RootTable.ID);
+ return tableIds;
+ } else {
+ throw new IllegalArgumentException("Unexpected Table in GC Env: " + this.level.name());
Review Comment:
Unexpected level, not unexpected table. Also, since DataLevel is an enum, it
might make more sense to use a switch statement with this case as the default,
rather than a series of if-else statements.
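A rough, untested sketch of the switch form (same logic as the stream suggestion above, just restructured):
```java
switch (level) {
  case ROOT:
    return Set.of(RootTable.ID);
  case METADATA:
    return Set.of(MetadataTable.ID);
  case USER:
    // Don't return tables that are NEW, DELETING, or in an UNKNOWN state.
    return getTableIDs().entrySet().stream()
        .filter(e -> e.getValue() == TableState.ONLINE || e.getValue() == TableState.OFFLINE)
        .map(Entry::getKey)
        .filter(k -> !k.equals(MetadataTable.ID) && !k.equals(RootTable.ID))
        .collect(Collectors.toSet());
  default:
    throw new IllegalArgumentException("Unexpected DataLevel in GC Env: " + level.name());
}
```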
##########
server/gc/src/main/java/org/apache/accumulo/gc/GCRun.java:
##########
@@ -171,8 +180,44 @@ public Stream<Reference> getReferences() {
}
@Override
- public Set<TableId> getTableIDs() {
- return context.getTableIdToNameMap().keySet();
+ public Map<TableId,TableState> getTableIDs() throws InterruptedException {
+ final String tablesPath = context.getZooKeeperRoot() + Constants.ZTABLES;
+ final ZooReader zr = context.getZooReader();
+ int retries = 1;
+ UncheckedIOException ioe = null;
+ while (retries <= 10) {
+ try {
+ zr.sync(tablesPath);
+ final Map<TableId,TableState> tids = new HashMap<>();
+ for (String table : zr.getChildren(tablesPath)) {
+ TableId tableId = TableId.of(table);
+ TableState tableState = null;
+ String statePath = context.getZooKeeperRoot() + Constants.ZTABLES + "/"
+ + tableId.canonical() + Constants.ZTABLE_STATE;
+ try {
+ byte[] state = zr.getData(statePath, null, null);
+ if (state == null) {
+ tableState = TableState.UNKNOWN;
+ } else {
+ tableState = TableState.valueOf(new String(state, UTF_8));
+ }
+ } catch (NoNodeException e) {
+ tableState = TableState.UNKNOWN;
+ }
+ tids.put(tableId, tableState);
+ }
+ return tids;
+ } catch (KeeperException e) {
+ retries++;
+ if (ioe == null) {
+ ioe = new UncheckedIOException(new IOException("Error getting table ids from ZooKeeper"));
Review Comment:
IOException seems a bit forced here. IllegalStateException might be more
appropriate, and wouldn't require this nesting of objects.
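For example, a sketch of the same retry loop without the forced IOException (the `readTableStates` helper is a stand-in for the `zr.sync()`/`getChildren()`/`getData()` body above, and the suppression detail is illustrative):
```java
private Map<TableId,TableState> getTableIDsWithRetries() throws InterruptedException {
  IllegalStateException lastError = null;
  for (int retries = 1; retries <= 10; retries++) {
    try {
      // stand-in for the ZooKeeper-reading body in the diff above
      return readTableStates();
    } catch (KeeperException e) {
      if (lastError == null) {
        lastError = new IllegalStateException("Error getting table ids from ZooKeeper");
      }
      // keep each attempt's failure for diagnosis; backoff between attempts elided
      lastError.addSuppressed(e);
    }
  }
  throw lastError;
}
```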
##########
server/gc/src/test/java/org/apache/accumulo/gc/GarbageCollectionTest.java:
##########
@@ -122,6 +158,29 @@ public void incrementInUseStat(long i) {}
public Iterator<Entry<String,Status>> getReplicationNeededIterator() {
return filesToReplicate.entrySet().iterator();
}
+
+ @Override
+ public Set<TableId> getCandidateTableIDs() {
+ if (level == Ample.DataLevel.ROOT) {
+ return Set.of(RootTable.ID);
+ } else if (level == Ample.DataLevel.METADATA) {
+ return Collections.singleton(MetadataTable.ID);
+ } else if (level == Ample.DataLevel.USER) {
+ Set<TableId> tableIds = new HashSet<>();
+ getTableIDs().forEach((k, v) -> {
+ if (v == TableState.ONLINE || v == TableState.OFFLINE) {
+ // Don't return tables that are NEW, DELETING, or in an
+ // UNKNOWN state.
+ tableIds.add(k);
+ }
+ });
+ tableIds.remove(MetadataTable.ID);
+ tableIds.remove(RootTable.ID);
+ return tableIds;
+ } else {
+ throw new IllegalArgumentException("unknown level " + level);
+ }
Review Comment:
Not going to rewrite this at the moment, but it should be improved similar
to my previous comments.
##########
server/gc/src/main/java/org/apache/accumulo/gc/GCRun.java:
##########
@@ -462,4 +507,39 @@ public long getErrorsStat() {
public long getCandidatesStat() {
return candidates;
}
+
+ /**
+ * Return a set of all TableIDs in the
+ * {@link org.apache.accumulo.core.metadata.schema.Ample.DataLevel} for which we are considering
+ * deletes. When operating on DataLevel.USER this will return all user table ids in an ONLINE or
Review Comment:
I still think this wording is vague. What does it mean to be "operating on"
a particular DataLevel?
##########
server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectionAlgorithm.java:
##########
@@ -216,6 +224,46 @@ private long removeBlipCandidates(GarbageCollectionEnvironment gce,
return blipCount;
}
+ /**
+ * Double check no tables were missed during GC
+ */
+ @VisibleForTesting
+ protected void ensureAllTablesChecked(Set<TableId> tableIdsBefore, Set<TableId> tableIdsSeen,
+ Set<TableId> tableIdsAfter) {
+
+ // if a table was added or deleted during this run, it is acceptable to not
+ // have seen those table ids when scanning the metadata table. So get the intersection
+ final Set<TableId> tableIdsMustHaveSeen = new HashSet<>(tableIdsBefore);
+ tableIdsMustHaveSeen.retainAll(tableIdsAfter);
+
+ if (tableIdsMustHaveSeen.isEmpty() && !tableIdsSeen.isEmpty()) {
+ throw new RuntimeException("Garbage collection will not proceed because "
+ + "table ids were seen in the metadata table and none were seen
Zookeeper. "
+ + "This can have two causes. First, total number of tables going
to/from "
+ + "zero during a GC cycle will cause this. Second, it could be
caused by "
+ + "corruption of the metadata table and/or Zookeeper. Only the
second cause "
+ + "is problematic, but there is no way to distinguish between the
two causes "
+ + "so this GC cycle will not proceed. The first cause should be
transient "
+ + "and one would not expect to see this message repeated in
subsequent GC cycles.");
+ }
Review Comment:
This case shouldn't kill the garbage collector. This is a normal case when a
user has recently initialized a cluster. We specifically have the initial GC delay
shorter (30s) than the cycle delay (5m), which makes this more likely to occur
if the first table they create is around the 30 second mark when the first GC
cycle is starting to run. We don't want to make it more likely that first-time
users are going to see the GC crash just because they created their first table
in Accumulo!
We should be able to skip this cycle with a warning rather than killing the
GC with a RTE.
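Something like the following sketch, assuming the method is changed to return whether the cycle may proceed (the boolean return is my invention here; the caller would need to check it):
```java
if (tableIdsMustHaveSeen.isEmpty() && !tableIdsSeen.isEmpty()) {
  log.warn("Table ids were seen in the metadata table but none were seen in Zookeeper. "
      + "This is expected if the total number of tables went to or from zero during this "
      + "GC cycle (e.g. shortly after initialization), so this cycle will be skipped.");
  return false; // skip this cycle; the next cycle should proceed normally
}
```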
##########
server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectionAlgorithm.java:
##########
@@ -216,6 +224,46 @@ private long removeBlipCandidates(GarbageCollectionEnvironment gce,
return blipCount;
}
+ /**
+ * Double check no tables were missed during GC
+ */
+ @VisibleForTesting
+ protected void ensureAllTablesChecked(Set<TableId> tableIdsBefore, Set<TableId> tableIdsSeen,
+ Set<TableId> tableIdsAfter) {
+
+ // if a table was added or deleted during this run, it is acceptable to not
+ // have seen those table ids when scanning the metadata table. So get the intersection
+ final Set<TableId> tableIdsMustHaveSeen = new HashSet<>(tableIdsBefore);
+ tableIdsMustHaveSeen.retainAll(tableIdsAfter);
+
+ if (tableIdsMustHaveSeen.isEmpty() && !tableIdsSeen.isEmpty()) {
+ throw new RuntimeException("Garbage collection will not proceed because "
+ + "table ids were seen in the metadata table and none were seen
Zookeeper. "
+ + "This can have two causes. First, total number of tables going
to/from "
+ + "zero during a GC cycle will cause this. Second, it could be
caused by "
+ + "corruption of the metadata table and/or Zookeeper. Only the
second cause "
+ + "is problematic, but there is no way to distinguish between the
two causes "
+ + "so this GC cycle will not proceed. The first cause should be
transient "
+ + "and one would not expect to see this message repeated in
subsequent GC cycles.");
+ }
+
+ // From that intersection, remove all the table ids that were seen.
+ tableIdsMustHaveSeen.removeAll(tableIdsSeen);
+
+ // If anything is left then we missed a table and may not have removed rfiles references
+ // from the candidates list that are actually still in use, which would
+ // result in the rfiles being deleted in the next step of the GC process
+ if (!tableIdsMustHaveSeen.isEmpty()) {
+ log.error("TableIDs before: " + tableIdsBefore);
+ log.error("TableIDs after : " + tableIdsAfter);
+ log.error("TableIDs seen : " + tableIdsSeen);
+ log.error("TableIDs that should have been seen but were not: " +
tableIdsMustHaveSeen);
+ // maybe a scan failed?
+ throw new IllegalStateException(
+ "Saw table IDs in ZK that were not in metadata table: " +
tableIdsMustHaveSeen);
Review Comment:
I feel like we should either log or throw, but not both. If we throw, then
we can rely on the caller to log the message. If we log and throw, then we're
probably getting duplicate error messages in the logs that are not obviously
the same problem.
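For example, a sketch that throws once with all the detail in the message and leaves logging to the caller:
```java
if (!tableIdsMustHaveSeen.isEmpty()) {
  // maybe a scan failed?
  throw new IllegalStateException(String.format(
      "Saw table IDs in ZK that were not in metadata table: %s (before=%s, after=%s, seen=%s)",
      tableIdsMustHaveSeen, tableIdsBefore, tableIdsAfter, tableIdsSeen));
}
```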
##########
server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectionAlgorithm.java:
##########
@@ -216,6 +224,46 @@ private long removeBlipCandidates(GarbageCollectionEnvironment gce,
return blipCount;
}
+ /**
+ * Double check no tables were missed during GC
+ */
+ @VisibleForTesting
+ protected void ensureAllTablesChecked(Set<TableId> tableIdsBefore, Set<TableId> tableIdsSeen,
+ Set<TableId> tableIdsAfter) {
+
+ // if a table was added or deleted during this run, it is acceptable to not
+ // have seen those table ids when scanning the metadata table. So get the intersection
+ final Set<TableId> tableIdsMustHaveSeen = new HashSet<>(tableIdsBefore);
+ tableIdsMustHaveSeen.retainAll(tableIdsAfter);
+
+ if (tableIdsMustHaveSeen.isEmpty() && !tableIdsSeen.isEmpty()) {
+ throw new RuntimeException("Garbage collection will not proceed because "
+ + "table ids were seen in the metadata table and none were seen
Zookeeper. "
+ + "This can have two causes. First, total number of tables going
to/from "
+ + "zero during a GC cycle will cause this. Second, it could be
caused by "
+ + "corruption of the metadata table and/or Zookeeper. Only the
second cause "
+ + "is problematic, but there is no way to distinguish between the
two causes "
+ + "so this GC cycle will not proceed. The first cause should be
transient "
+ + "and one would not expect to see this message repeated in
subsequent GC cycles.");
+ }
+
+ // From that intersection, remove all the table ids that were seen.
+ tableIdsMustHaveSeen.removeAll(tableIdsSeen);
+
+ // If anything is left then we missed a table and may not have removed rfiles references
+ // from the candidates list that are actually still in use, which would
+ // result in the rfiles being deleted in the next step of the GC process
+ if (!tableIdsMustHaveSeen.isEmpty()) {
+ log.error("TableIDs before: " + tableIdsBefore);
+ log.error("TableIDs after : " + tableIdsAfter);
+ log.error("TableIDs seen : " + tableIdsSeen);
+ log.error("TableIDs that should have been seen but were not: " +
tableIdsMustHaveSeen);
Review Comment:
One formatted error message should be enough. There's no need for 4 separate
log entries. Having 4 will age out other recently seen errors on the monitor
page (which shows recent logs) faster than necessary, and in a multi-threaded
system they can be interleaved with logs from other threads, making them harder
to parse and display with automated tooling.
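If we keep the logging instead, a single parameterized entry carries the same information (sketch):
```java
log.error("Table IDs that should have been seen but were not: {} (before={}, after={}, seen={})",
    tableIdsMustHaveSeen, tableIdsBefore, tableIdsAfter, tableIdsSeen);
```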