swamirishi commented on code in PR #5511:
URL: https://github.com/apache/ozone/pull/5511#discussion_r1388789480


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:
##########
@@ -469,23 +458,38 @@ public void onCompactionBegin(RocksDB db,
             return;
           }
         }
+        createHardLinks(compactionJobInfo.inputFiles());
+      }
+    };
+  }
 
-        // Create hardlink backups for the SST files that are going
-        // to be deleted after this RDB compaction.
-        for (String file : compactionJobInfo.inputFiles()) {
-          LOG.debug("Creating hard link for '{}'", file);
+  @VisibleForTesting
+  void createHardLinks(List<String> sourceFiles) {
+    try (ManagedOptions options = new ManagedOptions();
+         ManagedReadOptions readOptions = new ManagedReadOptions()) {
+      // Create hardlink backups for the SST files that are going
+      // to be deleted after this RDB compaction.
+      for (String file : sourceFiles) {
+        CompactionFileInfo fileInfo =
+            toFileInfo(file, options, readOptions);
+        if (shouldSkipFile(fileInfo)) {

Review Comment:
   you need not check every file. Just checking the first file is enough
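   For illustration, a possible shape (a sketch only, assuming every input 
file of one compaction belongs to the same column family; `toFileInfo` and 
`shouldSkipFile` are the helpers introduced in this PR):

```java
// Sketch: all input files of a single compaction share one column family,
// so inspecting the first file decides for the whole batch.
CompactionFileInfo firstFile =
    toFileInfo(sourceFiles.get(0), options, readOptions);
if (shouldSkipFile(firstFile)) {
  return;  // skip hard-link creation for the entire compaction
}
```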



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:
##########
@@ -1531,50 +1533,79 @@ public void pngPrintMutableGraph(String filePath, GraphType graphType)
     graph.generateImage(filePath);
   }
 
-  private List<CompactionFileInfo> toFileInfoList(
-      List<String> sstFiles,
-      ManagedOptions options,
-      ManagedReadOptions readOptions
+  /**
+   * Returns true if compactionFileInfo has column family information and
+   * the column family is not one of COLUMN_FAMILIES_TO_TRACK_IN_DAG.
+   * Otherwise, returns false and the file should be added to the
+   * compaction DAG. CompactionFileInfo lacks column family info in two cases:
+   * i). Backward compatibility: before HDDS-8940, the column family was
+   *     not persisted.
+   * ii). When it fails to read the SST file in {@link #toFileInfo}.
+   */
+  @VisibleForTesting
+  boolean shouldSkipFile(CompactionFileInfo compactionFileInfo) {
+    return Objects.nonNull(compactionFileInfo.getColumnFamily()) &&
+        !COLUMN_FAMILIES_TO_TRACK_IN_DAG.contains(
+            compactionFileInfo.getColumnFamily());
+  }
+
+  private List<CompactionFileInfo> toFileInfoList(List<String> sstFiles,
+                                                  ManagedOptions options,
+                                                  ManagedReadOptions readOptions
   ) {
     if (CollectionUtils.isEmpty(sstFiles)) {
       return Collections.emptyList();
     }
 
-    final int fileNameOffset = sstFiles.get(0).lastIndexOf("/") + 1;
     List<CompactionFileInfo> response = new ArrayList<>();
 
     for (String sstFile : sstFiles) {
-      String fileName = sstFile.substring(fileNameOffset,
-          sstFile.length() - SST_FILE_EXTENSION_LENGTH);
-      CompactionFileInfo.Builder fileInfoBuilder =
-          new CompactionFileInfo.Builder(fileName);
-      SstFileReader fileReader = new SstFileReader(options);
-      try {
-        fileReader.open(sstFile);
-        String columnFamily = StringUtils.bytes2String(
-            fileReader.getTableProperties().getColumnFamilyName());
-        SstFileReaderIterator iterator = fileReader.newIterator(readOptions);
-        iterator.seekToFirst();
-        String startKey = StringUtils.bytes2String(iterator.key());
-        iterator.seekToLast();
-        String endKey = StringUtils.bytes2String(iterator.key());
-        fileInfoBuilder.setStartRange(startKey)
-            .setEndRange(endKey)
-            .setColumnFamily(columnFamily);
-      } catch (RocksDBException rocksDBException) {
-        // Ideally it should not happen. If it does just log the exception.
-        // And let the compaction complete without the exception.
-        // Throwing exception in compaction listener could fail the RocksDB.
-        // In case of exception, compaction node will be missing start key,
-        // end key and column family. And it will continue the traversal as
-        // it was before HDDS-8940.
-        LOG.warn("Failed to read SST file: {}.", sstFile, rocksDBException);
+      CompactionFileInfo fileInfo = toFileInfo(sstFile, options, readOptions);
+      if (shouldSkipFile(fileInfo)) {
+        LOG.debug("Skipping sst file: '{}' belongs to column family '{}'.",

Review Comment:
   Since you would be skipping the entire compaction, there is no need to log 
each file individually. Instead, log once that the compaction is being skipped, 
listing its input files and output files.
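   A possible single log statement (a sketch; `inputFiles()` and 
`outputFiles()` are both available on RocksDB's `CompactionJobInfo`):

```java
// Sketch: one message for the whole skipped compaction, not per file.
LOG.info("Skipping compaction of untracked column family '{}'. "
        + "Input files: {}, output files: {}.",
    columnFamily,
    compactionJobInfo.inputFiles(),
    compactionJobInfo.outputFiles());
```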



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:
##########
@@ -455,9 +447,6 @@ public void onCompactionBegin(RocksDB db,
           return;
         }

Review Comment:
   Combine both if statements (I am unable to make a suggestion on that line 
directly). In that case the other changes need not be done. Please add a unit 
test for this function alone.
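   To make the unit-test ask concrete, here is a minimal, self-contained 
sketch. The real test would go through `RocksDBCheckpointDiffer` and 
`CompactionFileInfo`; both are stubbed here, and the tracked-family names 
below are assumptions for illustration, not taken from this PR.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Self-contained sketch of a unit test for the shouldSkipFile() predicate.
public class ShouldSkipFileSketch {
  // Stand-in for COLUMN_FAMILIES_TO_TRACK_IN_DAG; contents are assumed.
  static final Set<String> COLUMN_FAMILIES_TO_TRACK_IN_DAG =
      new HashSet<>(Arrays.asList("keyTable", "directoryTable", "fileTable"));

  // Same predicate as in the PR: skip only when the column family is known
  // and is not one of the tracked families.
  static boolean shouldSkipFile(String columnFamily) {
    return Objects.nonNull(columnFamily)
        && !COLUMN_FAMILIES_TO_TRACK_IN_DAG.contains(columnFamily);
  }

  public static void main(String[] args) {
    // Backward-compat / unreadable SST: no column family -> do not skip.
    if (shouldSkipFile(null)) {
      throw new AssertionError("null column family must not be skipped");
    }
    // Tracked column family -> do not skip.
    if (shouldSkipFile("keyTable")) {
      throw new AssertionError("tracked column family must not be skipped");
    }
    // Untracked column family -> skip.
    if (!shouldSkipFile("compactionLogTable")) {
      throw new AssertionError("untracked column family must be skipped");
    }
    System.out.println("shouldSkipFile cases pass");
  }
}
```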



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:
##########
@@ -188,6 +178,8 @@ public class RocksDBCheckpointDiffer implements AutoCloseable,
   private ColumnFamilyHandle compactionLogTableCFHandle;
   private RocksDB activeRocksDB;
 
+  public static final Set<String> COLUMN_FAMILIES_TO_TRACK_IN_DAG =

Review Comment:
   Any reason why you need public?
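   If it is only needed by tests, a package-private field annotated with 
`@VisibleForTesting` would be enough, e.g. (sketch; the set contents here 
are placeholders, not taken from this PR):

```java
// Sketch: package-private is visible to tests in the same package.
@VisibleForTesting
static final Set<String> COLUMN_FAMILIES_TO_TRACK_IN_DAG =
    new HashSet<>(Arrays.asList("keyTable", "directoryTable", "fileTable"));
```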



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:
##########
@@ -455,9 +447,6 @@ public void onCompactionBegin(RocksDB db,
           return;
         }

Review Comment:
   CompactionJobInfo has a function to get the column family, so you can skip 
the whole compaction listener here. There is no need to check individual files: 
https://github.com/facebook/rocksdb/blob/dfaf4dc111ff090800f2765fdcd88d48a6abbab8/java/src/main/java/org/rocksdb/CompactionJobInfo.java#L32
   
   ```suggestion
          }
          if (!COLUMN_FAMILIES_TO_TRACK_IN_DAG.contains(
              StringUtils.bytes2String(compactionJobInfo.columnFamilyName()))) {
            return;
          }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

