mjwall commented on a change in pull request #2293:
URL: https://github.com/apache/accumulo/pull/2293#discussion_r732102682



##########
File path: server/gc/src/test/java/org/apache/accumulo/gc/GarbageCollectionTest.java
##########
@@ -644,4 +699,132 @@ public void bulkImportReplicationRecordsPreventDeletion() throws Exception {
     assertEquals(1, gce.deletes.size());
     assertEquals("hdfs://foo.com:6000/accumulo/tables/2/t-00002/A000002.rf", gce.deletes.get(0));
   }
+
+  @Test
+  public void testMissingTableIds() throws Exception {
+    GarbageCollectionAlgorithm gca = new GarbageCollectionAlgorithm();
+
+    TestGCE gce = new TestGCE();
+
+    gce.candidates.add("hdfs://foo.com:6000/user/foo/tables/a/t-0/F00.rf");
+
+    gce.addFileReference("a", null, "hdfs://foo.com:6000/user/foo/tables/a/t-0/F00.rf");
+    gce.addFileReference("c", null, "hdfs://foo.com:6000/user/foo/tables/c/t-0/F00.rf");
+
+    // the following table ids must be seen in the references
+    gce.tableIds.add("a");
+    gce.tableIds.add("b");
+    gce.tableIds.add("c");
+    gce.tableIds.add("d");
+
+    String msg = Assert.assertThrows(RuntimeException.class, () -> gca.collect(gce)).getMessage();
+    Assert.assertTrue(msg, (msg.contains("[b, d]") || msg.contains("[d, b]"))
+        && msg.contains("Saw table IDs in ZK that were not in metadata table:"));
+  }
+
+  // below are tests for potential failure conditions of the GC process. Some of these cases were
+  // observed on clusters. Some were hypothesis based on observations. The result was that
+  // candidate entries were not removed when they should have been and therefore files were
+  // removed from HDFS that were actually still in use
+
+  private Set<String> makeUnmodifiableSet(String... args) {
+    return Collections.unmodifiableSet(new HashSet<>(Arrays.asList(args)));
+  }
+
+  @Test
+  public void testNormalGCRun() {
+    // happy path, no tables added or removed during this portion and all the tables checked
+    Set<String> tablesBefore = makeUnmodifiableSet("1", "2", "3");
+    Set<String> tablesSeen = makeUnmodifiableSet("2", "1", "3");
+    Set<String> tablesAfter = makeUnmodifiableSet("1", "3", "2");
+
+    new GarbageCollectionAlgorithm().ensureAllTablesChecked(tablesBefore, tablesSeen, tablesAfter);
+  }
+
+  @Test
+  public void testTableAddedInMiddle() {
+    // table was added during this portion and we don't see it, should be fine
+    Set<String> tablesBefore = makeUnmodifiableSet("1", "2", "3");

Review comment:
       It would work for Java 9 and above. I know I have to backport this to the 
1.9 branch, which is still on Java 1.8. I can make the change here to take 
advantage of the new API and then use my hacky method when I backport. Thanks 
@DomGarguilo
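
For reference, a minimal sketch of the two approaches being compared. It assumes the "new API" is `Set.of` (added in Java 9) and that the "hacky method" is the `makeUnmodifiableSet` helper from the diff above; the class name `UnmodifiableSetSketch` is hypothetical and only there to make the example self-contained.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class UnmodifiableSetSketch {

  // Java 8 compatible helper, mirroring makeUnmodifiableSet from the diff.
  static Set<String> makeUnmodifiableSet(String... args) {
    return Collections.unmodifiableSet(new HashSet<>(Arrays.asList(args)));
  }

  public static void main(String[] args) {
    // Works on Java 8 and later.
    Set<String> java8Style = makeUnmodifiableSet("1", "2", "3");

    // Assumption: Set.of is the Java 9+ API the comment refers to.
    Set<String> java9Style = Set.of("2", "1", "3");

    // Set equality ignores insertion order, so the tests behave the same either way.
    System.out.println(java8Style.equals(java9Style));
  }
}
```

One behavioral difference to keep in mind when switching: `Set.of` throws `IllegalArgumentException` on duplicate elements and `NullPointerException` on null elements, whereas the `HashSet`-based helper silently collapses duplicates and accepts null.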





