hemantk-12 commented on code in PR #3981:
URL: https://github.com/apache/ozone/pull/3981#discussion_r1040043801


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/CompactionNode.java:
##########
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.ozone.rocksdiff;
+
+/**
+ * Node in the compaction DAG that represents an SST file.
+ */
+public class CompactionNode {
+  // Name of the SST file
+  private final String fileName;
+  // The last snapshot created before this node came into existence
+  private final String snapshotId;

Review Comment:
   How is `snapshotId` different from `fileName` if it can be any random string? Why can't we just use `fileName`?



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/CompactionNode.java:
##########
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.ozone.rocksdiff;
+
+/**
+ * Node in the compaction DAG that represents an SST file.
+ */
+public class CompactionNode {
+  // Name of the SST file
+  private final String fileName;
+  // The last snapshot created before this node came into existence
+  private final String snapshotId;
+  private final long snapshotGeneration;
+  private final long totalNumberOfKeys;
+  private long cumulativeKeysReverseTraversal;
+
+  CompactionNode(String file, String ssId, long numKeys, long seqNum) {

Review Comment:
   I don't have a very strong reason to make it public, so you can keep it as it is.
   
   To me, it is a simple data class; I don't see any issue if it is made public.
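   Just to illustrate (hypothetical class name; the field-to-parameter mapping is assumed from the hunk above, not taken from the PR), a public constructor on a plain data holder like this would look as follows:

```java
package org.apache.ozone.rocksdiff;

/**
 * Hypothetical sketch only, not the PR's code: a CompactionNode-like data
 * class with a public constructor, to show that widening visibility is
 * harmless for a simple data holder.
 */
public class CompactionNodeSketch {
  private final String fileName;          // SST file name
  private final String snapshotId;        // last snapshot before this node
  private final long snapshotGeneration;  // assumed to come from seqNum
  private final long totalNumberOfKeys;

  public CompactionNodeSketch(String file, String ssId,
      long numKeys, long seqNum) {
    this.fileName = file;
    this.snapshotId = ssId;
    this.totalNumberOfKeys = numKeys;
    this.snapshotGeneration = seqNum;
  }

  public String getFileName() {
    return fileName;
  }

  public String getSnapshotId() {
    return snapshotId;
  }

  public long getSnapshotGeneration() {
    return snapshotGeneration;
  }

  public long getTotalNumberOfKeys() {
    return totalNumberOfKeys;
  }
}
```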



##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/DBStoreBuilder.java:
##########
@@ -298,6 +308,18 @@ private ManagedColumnFamilyOptions getDefaultCfOptions() {
         .orElseGet(defaultCfProfile::getColumnFamilyOptions);
   }
 
+  /**
+   * Get default column family options, but with column family write buffer
+   * size limit overridden.
+   * @param writeBufferSize Specify column family write buffer size.
+   * @return ManagedColumnFamilyOptions
+   */
+  private ManagedColumnFamilyOptions getDefaultCfOptions(long writeBufferSize) {

Review Comment:
   I disagree that it is still the "default", but I'll leave it up to you.
   
   I think the real confusion here is whether we are creating a default config or using the default config when none is given. We are doing the latter: using the default config when no config is provided.
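   For illustration only, here is a hedged sketch of that distinction using the raw RocksDB API rather than Ozone's `ManagedColumnFamilyOptions` (the class and method names below are invented stand-ins, not the actual `DBStoreBuilder` code): the overload reuses the defaults and merely overrides the write buffer size, so the default config is what gets used whenever no explicit size is provided.

```java
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.RocksDB;

/**
 * Hypothetical sketch: "use the default config when no config is provided",
 * as opposed to creating a brand-new default.
 */
public class DefaultCfOptionsSketch {

  // Stand-in for getDefaultCfOptions(): plain profile defaults.
  static ColumnFamilyOptions defaultCfOptions() {
    return new ColumnFamilyOptions();
  }

  // Stand-in for getDefaultCfOptions(long): same defaults, size overridden.
  static ColumnFamilyOptions defaultCfOptions(long writeBufferSize) {
    return defaultCfOptions().setWriteBufferSize(writeBufferSize);
  }

  public static void main(String[] args) {
    RocksDB.loadLibrary();
    long configuredSize = 0; // e.g. unset in configuration
    try (ColumnFamilyOptions opts = configuredSize > 0
        ? defaultCfOptions(configuredSize)   // override only when configured
        : defaultCfOptions()) {              // otherwise fall back to defaults
      System.out.println("write buffer size: " + opts.writeBufferSize());
    }
  }
}
```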



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/test/java/org/apache/ozone/rocksdiff/TestRocksDBCheckpointDiffer.java:
##########
@@ -90,8 +101,145 @@ public static void init() {
     GenericTestUtils.setLogLevel(TestRocksDBCheckpointDiffer.LOG, Level.INFO);
   }
 
+  /**
+   * Test cases for testGetSSTDiffListWithoutDB.
+   */
+  private static Stream<Arguments> testGetSSTDiffListWithoutDBCases() {
+
+    DifferSnapshotInfo snapshotInfo1 = new DifferSnapshotInfo(
+        "/path/to/dbcp1", "ssUUID1", 3008L);
+    DifferSnapshotInfo snapshotInfo2 = new DifferSnapshotInfo(
+        "/path/to/dbcp2", "ssUUID2", 14980L);
+    DifferSnapshotInfo snapshotInfo3 = new DifferSnapshotInfo(
+        "/path/to/dbcp3", "ssUUID3", 17975L);
+    DifferSnapshotInfo snapshotInfo4 = new DifferSnapshotInfo(
+        "/path/to/dbcp4", "ssUUID4", 18000L);
+
+    Set<String> snapshotSstFiles1 = new HashSet<>(asList(
+        "000059", "000053"));
+    Set<String> snapshotSstFiles2 = new HashSet<>(asList(
+        "000088", "000059", "000053", "000095"));
+    Set<String> snapshotSstFiles3 = new HashSet<>(asList(
+        "000088", "000105", "000059", "000053", "000095"));
+    Set<String> snapshotSstFiles4 = new HashSet<>(asList(
+        "000088", "000105", "000059", "000053", "000095", "000108"));
+    Set<String> snapshotSstFiles1Alt1 = new HashSet<>(asList(
+        "000059", "000053", "000066"));
+    Set<String> snapshotSstFiles1Alt2 = new HashSet<>(asList(
+        "000059", "000053", "000052"));
+    Set<String> snapshotSstFiles2Alt2 = new HashSet<>(asList(
+        "000088", "000059", "000053", "000095", "000099"));
+    Set<String> snapshotSstFiles2Alt3 = new HashSet<>(asList(
+        "000088", "000059", "000053", "000062"));
+
+    return Stream.of(
+        Arguments.of("Test 1: Regular case. Expands expandable " +
+                "SSTs in the initial diff.",
+            snapshotInfo3,
+            snapshotInfo1,
+            snapshotSstFiles3,
+            snapshotSstFiles1,
+            new HashSet<>(asList("000059", "000053")),
+            new HashSet<>(asList(
+                "000066", "000105", "000080", "000087", "000073", "000095"))),
+        Arguments.of("Test 2: Crafted input: One source " +
+                "('to' snapshot) SST file is never compacted (newly flushed)",
+            snapshotInfo4,
+            snapshotInfo3,
+            snapshotSstFiles4,
+            snapshotSstFiles3,
+            new HashSet<>(asList(
+                "000088", "000105", "000059", "000053", "000095")),
+            new HashSet<>(asList("000108"))),
+        Arguments.of("Test 3: Crafted input: Same SST files " +
+                "found during SST expansion",
+            snapshotInfo2,
+            snapshotInfo1,
+            snapshotSstFiles2,
+            snapshotSstFiles1Alt1,
+            new HashSet<>(asList("000066", "000059", "000053")),
+            new HashSet<>(asList(
+                "000080", "000087", "000073", "000095"))),
+        Arguments.of("Test 4: Crafted input: Skipping known " +
+                "processed SST.",
+            snapshotInfo2,
+            snapshotInfo1,
+            snapshotSstFiles2Alt2,
+            snapshotSstFiles1Alt2,
+            new HashSet<>(),
+            new HashSet<>()),
+        Arguments.of("Test 5: Hit snapshot generation early exit " +
+                "condition",
+            snapshotInfo2,
+            snapshotInfo1,
+            snapshotSstFiles2Alt3,
+            snapshotSstFiles1,
+            new HashSet<>(asList("000059", "000053")),
+            new HashSet<>(asList(
+                "000066", "000080", "000087", "000073", "000062")))
+    );
+  }
+
+  /**
+   * Tests core SST diff list logic. Does not involve DB.
+   * Focuses on testing edge cases in internalGetSSTDiffList().
+   */
+  @ParameterizedTest(name = "{0}")
+  @MethodSource("testGetSSTDiffListWithoutDBCases")
+  public void testGetSSTDiffListWithoutDB(String description,
+      DifferSnapshotInfo srcSnapshot,
+      DifferSnapshotInfo destSnapshot,
+      Set<String> srcSnapshotSstFiles,
+      Set<String> destSnapshotSstFiles,
+      Set<String> expectedSameSstFiles,
+      Set<String> expectedDiffSstFiles) {
+
+    RocksDBCheckpointDiffer differ = new RocksDBCheckpointDiffer();
+
+    String compactionLog = ""
+        + "S 1000 df6410c7-151b-4e90-870e-5ef12875acd5\n"  // Snapshot 0
+        + "C 000001,000002:000062\n"
+          // Additional "compaction" to trigger and test early exit condition
+        + "S 3008 ef6410c7-151b-4e90-870e-5ef12875acd5\n"  // Snapshot 1
+        + "C 000068,000062:000069\n"  // Regular compaction
+        + "C 000071,000064,000060,000052:000071,000064,000060,000052\n"
+          // Trivial move
+        + "C 000073,000066:000074\n"
+        + "C 000082,000076,000069:000083\n"
+        + "C 000087,000080,000074:000088\n"
+        + "C 000093,000090,000083:\n"  // Deletion?
+        + "S 14980 e7ad72f8-52df-4430-93f6-0ee91d4a47fd\n"  // Snapshot 2
+        + "C 000098,000096,000085,000078,000071,000064,000060,000052:000099\n"
+        + "C 000105,000095,000088:000107\n"
+        + "S 17975 4f084f6e-ed3d-4780-8362-f832303309ea\n";  // Snapshot 3
+
+    // Construct DAG from compaction log input
+    Arrays.stream(compactionLog.split("\n")).forEach(
+        differ::processCompactionLogLine);
+
+    Set<String> actualSameSstFiles = new HashSet<>();
+    Set<String> actualDiffSstFiles = new HashSet<>();
+    differ.internalGetSSTDiffList(
+        srcSnapshot,
+        destSnapshot,
+        srcSnapshotSstFiles,
+        destSnapshotSstFiles,
+        differ.getForwardCompactionDAG(),
+        actualSameSstFiles,
+        actualDiffSstFiles);
+    // Check same and different SST files result
+    Assertions.assertEquals(expectedSameSstFiles, actualSameSstFiles);
+    Assertions.assertEquals(expectedDiffSstFiles, actualDiffSstFiles);
+  }
+
+  /**
+   * Tests DB listener (compaction log generation, SST backup),
+   * SST file list diff.
+   * <p>
+   * Does actual DB write, flush, compaction.
+   */
   @Test
-  void testMain() throws Exception {
+  void testWithDB() throws Exception {

Review Comment:
   I think it is generally not a good practice to use Javadoc and comments to describe what a test is doing; the test name should be good enough for that.
   
   Also, this test is doing a lot of things. Anyway, you can leave it as is; it might be fixed as part of https://issues.apache.org/jira/browse/HDDS-7566.
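   As a purely illustrative sketch of that suggestion (the test names and the split below are invented, not a prescription for this PR), the scenario description can live in the method name or `@DisplayName`, and unrelated concerns can go into separate tests:

```java
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

/**
 * Hypothetical naming sketch: the name carries the scenario,
 * so no Javadoc is needed to explain what the test does.
 */
class CompactionDifferNamingSketch {

  @Test
  @DisplayName("Compaction listener appends an entry to the compaction log")
  void compactionListenerAppendsToCompactionLog() {
    // ... arrange a real DB, trigger compaction, assert on the log ...
  }

  @Test
  @DisplayName("Compaction listener backs up the input SST files")
  void compactionListenerBacksUpSstFiles() {
    // ... assert the pre-compaction SST files were backed up ...
  }
}
```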
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
