[ 
https://issues.apache.org/jira/browse/HDDS-8940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755663#comment-17755663
 ] 

Hemant Kumar edited comment on HDDS-8940 at 8/18/23 6:05 PM:
-------------------------------------------------------------

The problem is that 
[SSTFilteringService|https://github.com/apache/ozone/blob/master/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/SstFilteringService.java]
 and the [SST pruning 
service|https://github.com/apache/ozone/blob/master/hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java#L1483]
 work independently, and each tries to save space by deleting unnecessary SST 
files. *SSTFilteringService* deletes files that don't belong to the 
snapshotted bucket, and the SST pruning service deletes files that are not 
required for diff calculations. The compaction DAG, on the other hand, is 
global at the Ozone level and is not aware of either cleanup.

Now let's take our favorite example: 

  !example.png! 

1. Compaction from level-1 to level-2 happens before snapshot-1 is taken.
2. Snapshot-1 is taken at level 2 (files: 000018.sst, 000016.sst, 000017.sst, 
000026.sst, 000024.sst, 000022.sst, 000020.sst).
3. To keep the example simple, let's assume *SSTFilteringService* kicks in and 
deletes files 000018.sst, 000016.sst, and 000017.sst from the checkpoint 
directory of snapshot-1 because they are not needed for snapshot-1. (On an 
actual cluster, compactions happen in parallel, so not all files will be 
deleted; please check the attached examples: 
[Compaction_Dag.png|https://issues.apache.org/jira/secure/attachment/13060984/HDDS-8940_Compaction_Dag.png],
 
[Compaction_Dag_1.png|https://issues.apache.org/jira/secure/attachment/13062229/HDDS-8940_Compaction_Dag_1.png],
 
[Compaction_Dag_2.png|https://issues.apache.org/jira/secure/attachment/13062230/HDDS-8940_Compaction_Dag_2.png].)
4. At the same time, the SST pruning service deletes all the non-leaf nodes: 
000015.sst, 000013.sst, 000011.sst, 000009.sst, 000018.sst, 000016.sst, 
000017.sst.
5. Snapshot-2 is taken at level 5 (files: 000059.sst, 000055.sst, 000056.sst, 
000060.sst, 000057.sst, 000058.sst, and any other new files).
6. A snapshot diff job is submitted for snapshot-2 and snapshot-1. During the 
DAG traversal, when it reaches level 3, nodes 000018.sst, 000016.sst, and 
000017.sst are added to the traversal because they are not present in 
snapshot-1's checkpoint dir 
([code|https://github.com/apache/ozone/blob/c801c02455982d3488cb099942f86912a492dc89/hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java#L1049]).
 On the next level, level 2, 000018.sst, 000016.sst, and 000017.sst are added 
to the diff files because those nodes were created before the snapshot was 
taken 
([condition|https://github.com/apache/ozone/blob/c801c02455982d3488cb099942f86912a492dc89/hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java#L1024]).
7. In the final step, before [returning the files to 
SnapshotDiffManager|https://github.com/apache/ozone/blob/c801c02455982d3488cb099942f86912a492dc89/hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java#L877],
 we look for them in either the [active DB dir or the SST backup 
dir|https://github.com/apache/ozone/blob/c801c02455982d3488cb099942f86912a492dc89/hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java#L832].
 The active DB dir doesn't have these files because they were compacted, and 
the SST backup dir doesn't have them because the SST pruning service deleted 
them, so the request fails.
8. Ideally, 000018.sst, 000016.sst, and 000017.sst are not needed for the 
snapshot diff of snapshot-1 and snapshot-2, and the compaction DAG traversal 
should not have added them to the diff list. It did because the DAG is global 
and not aware of the space optimizations.
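
The steps above can be reproduced with a small toy model. This is only a sketch of the behaviour described in this comment, not the RocksDBCheckpointDiffer code: the Node class, the sst_diff_list and resolve helpers, the DAG edges, the generation numbers, and the intermediate files 000030.sst and 000035.sst are all made up for illustration. Only 000018.sst, 000020.sst, 000059.sst and the exception message come from this issue.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    generation: int                             # hypothetical creation order
    inputs: list = field(default_factory=list)  # files it was compacted from


def sst_diff_list(src_files, dest_files, dest_generation, dag):
    """Walk the DAG from the newer snapshot's files toward older files."""
    same, different = set(), set()
    for name in sorted(src_files):
        if name in dest_files:
            same.add(name)
            continue
        if name not in dag:
            different.add(name)
            continue
        level = [dag[name]]
        while level:
            next_level = []
            for node in level:
                # Created before the older snapshot, or not produced by any
                # compaction: stop here and report the file as a diff file.
                if node.generation < dest_generation or not node.inputs:
                    different.add(node.name)
                    continue
                for child in node.inputs:
                    if child in same or child in different:
                        continue
                    if child in dest_files:
                        same.add(child)
                    else:
                        next_level.append(dag[child])
            level = next_level
    return different


def resolve(files, active_db, sst_backup):
    """Final lookup: every diff file must exist in one of the two dirs."""
    for name in sorted(files):
        if name not in active_db and name not in sst_backup:
            raise FileNotFoundError(f"Can't find SST file: {name}")
    return sorted(files)


# Invented DAG: 000059 <- {000030, 000035}, 000030 <- {000018, 000020}.
DAG = {
    "000059.sst": Node("000059.sst", 4, ["000030.sst", "000035.sst"]),
    "000030.sst": Node("000030.sst", 3, ["000018.sst", "000020.sst"]),
    "000035.sst": Node("000035.sst", 3),
    "000018.sst": Node("000018.sst", 1),
    "000020.sst": Node("000020.sst", 1),
}
SNAPSHOT_2 = {"000059.sst"}

# Snapshot-1 checkpoint dir before and after SSTFilteringService ran.
unfiltered = sst_diff_list(SNAPSHOT_2, {"000018.sst", "000020.sst"}, 2, DAG)
filtered = sst_diff_list(SNAPSHOT_2, {"000020.sst"}, 2, DAG)
print(sorted(unfiltered))  # ['000035.sst']
print(sorted(filtered))    # ['000018.sst', '000035.sst']

try:
    # 000018.sst was compacted out of the active DB and pruned from backup.
    resolve(filtered, active_db={"000059.sst"}, sst_backup={"000035.sst"})
except FileNotFoundError as err:
    print(err)             # Can't find SST file: 000018.sst
```

With the unfiltered checkpoint dir the traversal stops at 000018.sst as a file common to both snapshots. Once the file is filtered out of the checkpoint dir, the same traversal reports it as a diff file, and the final lookup fails because neither directory has it any more.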





> SST files are missing on optimized snapDiff path.
> -------------------------------------------------
>
>                 Key: HDDS-8940
>                 URL: https://issues.apache.org/jira/browse/HDDS-8940
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Hemant Kumar
>            Assignee: Hemant Kumar
>            Priority: Major
>         Attachments: HDDS-8940_Compaction_Dag.png, 
> HDDS-8940_Compaction_Dag_1.png, HDDS-8940_Compaction_Dag_2.png, example.png
>
>
> While running snapDiff, we are seeing SST files missing on optimized snapDiff 
> path.
> {code}
> 2023-06-23 19:59:16,323 [snapshot-diff-job-thread-id-14] ERROR org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager: Caught checked exception during diff report generation for volume: volume1 bucket: bucket1, fromSnapshot: alma2 and toSnapshot: cm-tmp-0ae3d532-237d-4df2-83f9-4844d153521e
> java.io.FileNotFoundException: Can't find SST file: 010788.sst
> at org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getAbsoluteSstFilePath(RocksDBCheckpointDiffer.java:654)
> at org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.filterRelevantSstFilesFullPath(RocksDBCheckpointDiffer.java:949)
> at org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffList(RocksDBCheckpointDiffer.java:933)
> at org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffListWithFullPath(RocksDBCheckpointDiffer.java:868)
> at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getDeltaFiles(SnapshotDiffManager.java:929)
> at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getDeltaFilesAndDiffKeysToObjectIdToKeyMap(SnapshotDiffManager.java:793)
> at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.generateSnapshotDiffReport(SnapshotDiffManager.java:721)
> at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.lambda$0(SnapshotDiffManager.java:565)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}


