hemantk-12 commented on code in PR #4163:
URL: https://github.com/apache/ozone/pull/4163#discussion_r1091338778
##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:
##########
@@ -1271,6 +1279,30 @@ private SnapshotLogInfo(long snapshotGenerationId,
}
}
+ /**
+ * Defines the task that removes SST files from backup directory which are
+ * not needed to generate snapshot diff using compaction DAG to clean
+ * the disk space.
+ * We can’t simply delete input files in the compaction completed listener
+ * because it is not known which of input files are from previous compaction
+ * and which were created after the compaction.
+ * We can remove SST files which were created from the compaction because
+ * those are not needed to generate snapshot diff. These files are basically
+ * non-leaf nodes of the DAG.
Review Comment:
Thanks :)
##########
hadoop-hdds/rocksdb-checkpoint-differ/src/test/java/org/apache/ozone/rocksdiff/TestRocksDBCheckpointDiffer.java:
##########
@@ -1100,4 +1114,108 @@ public void testPruneOlderSnapshotsWithCompactionHistory(
deleteDirectory(compactionLogDir);
deleteDirectory(sstBackUpDir);
}
+
+ private static Stream<Arguments> sstFilePruningScenarios() {
+ return Stream.of(
+ Arguments.of("Case 1: No compaction.",
+ "",
+ Arrays.asList("000015", "000013", "000011", "000009"),
+ Arrays.asList("000015", "000013", "000011", "000009")
+ ),
+ Arguments.of("Case 2: One level compaction.",
+ "C 000015,000013,000011,000009:000018,000016,000017\n",
+ Arrays.asList("000015", "000013", "000011", "000009", "000018",
+ "000016", "000017", "000026", "000024", "000022", "000020"),
+ Arrays.asList("000015", "000013", "000011", "000009", "000026",
+ "000024", "000022", "000020")
+ ),
Review Comment:
No actually.
Let's take case 2 for example:
Level 1: "000015", "000013", "000011", "000009"
Level 2: "000018", "000016", "000017", "000026", "000024", "000022", "000020"
Files "000018", "000016", "000017" come from the compaction, and "000026",
"000024", "000022", "000020" are new files at level 2.
For the diff, we only need the files "000015", "000013", "000011", "000009",
"000026", "000024", "000022", "000020".
If a snapshot is taken before the compaction, only "000015", "000013",
"000011", "000009" are needed to generate the diff.
If a snapshot is taken after the compaction, "000015", "000013", "000011",
"000009", "000026", "000024", "000022", "000020" are needed. "000018",
"000016", "000017" are common files and won't generate any diff.
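To make the Case 2 reasoning concrete, here is a rough sketch of the selection
rule, not the code in this PR. It assumes the compaction log line format used
in the test data ("C <inputs>:<outputs>") and simply drops every compaction
output from the backup file list. The class and method names
(SstPruneSketch, compactionOutputs, filesToKeep) are made up for illustration,
and the sketch does not build or walk the compaction DAG, so it only mirrors
Case 1 and Case 2 above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of RocksDBCheckpointDiffer. */
final class SstPruneSketch {

  private SstPruneSketch() { }

  /**
   * Collects compaction output files from log lines shaped like
   * "C in1,in2:out1,out2" (format assumed from the test data).
   */
  static Set<String> compactionOutputs(String compactionLog) {
    Set<String> outputs = new HashSet<>();
    for (String line : compactionLog.split("\n")) {
      if (!line.startsWith("C ")) {
        continue; // skip blank lines and any other entry types
      }
      String[] parts = line.substring(2).split(":");
      if (parts.length == 2) {
        outputs.addAll(Arrays.asList(parts[1].trim().split(",")));
      }
    }
    return outputs;
  }

  /**
   * Returns the backup files that are not compaction outputs,
   * preserving the order of the input list.
   */
  static List<String> filesToKeep(String compactionLog,
      List<String> backupFiles) {
    Set<String> outputs = compactionOutputs(compactionLog);
    List<String> kept = new ArrayList<>();
    for (String file : backupFiles) {
      if (!outputs.contains(file)) {
        kept.add(file);
      }
    }
    return kept;
  }
}
```

With the Case 2 log line and the eleven backup files, filesToKeep drops
"000018", "000016", "000017" and keeps the other eight; with an empty log
(Case 1) it keeps everything.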
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]