[ 
https://issues.apache.org/jira/browse/HDDS-8940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17754810#comment-17754810
 ] 

Hemant Kumar edited comment on HDDS-8940 at 8/16/23 5:52 PM:
-------------------------------------------------------------

There is another occurrence of this issue.

Important logs:

{code}
...
2023-08-14 19:29:45,520 INFO [OM StateMachine ApplyTransaction Thread - 
0]-org.apache.hadoop.ozone.om.request.snapshot.OMSnapshotCreateRequest: Created 
snapshot: 'cm-46-1692041157584-0' with snapshotId: 
'c2ae0246-626b-4a09-b1c8-0e76997c26ee' under path 'vol23io/bucket870io'
...
...
2023-08-14 19:30:25,363 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.RocksDatabase: Deleting 
sst file /001319.sst corresponding to column family keyTable from db: 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee
...
...
2023-08-14 19:43:19,498 INFO 
[CompactionDagPruningService]-org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer:
 Removing SST files: [000971, 000210, 000297, 001386, 000457, 000611, 000699, 
000974, 000214, 000973, 001025, 001026, 001389, 000615, 000736, 000856, 000619, 
000618, 001393, 001273, 001153, 001395, 000443, 001137, 000964, 001134, 000203, 
000687, 001377, 000323, 000565, 001015, 000968, 000846, 000206, 000569, 000602, 
000848, 001021, 000450, 000570, 000172, 000476, 001049, 000992, 000111, 000110, 
000594, 000513, 001287, 001320, 001200, 001168, 000356, 000516, 000757, 000911, 
001328, 000519, 000915, 001209, 000918, 001296, 001290, 000481, 001172, 001293, 
000102, 000465, 000860, 000468, 000588, 000863, 000620, 001312, 000748, 000109, 
000505, 000229, 001319, 000628, 000908, 001284, 000190, 000472, 000350, 000471, 
000194, 001346, 000651, 000375, 000253, 000374, 000891, 000777, 001343, 000258, 
000412, 000775, 000418, 000814, 000811, 000899, 000818, 001229, 000817, 000936, 
000819, 001075, 000262, 000140, 001072, 000487, 000883, 000244, 000364, 000760, 
000766, 000920, 001331, 000127, 000248, 000765, 000522, 000367, 000763, 000406, 
000921, 001339, 000409, 000808, 000929, 001063, 001184, 001186, 001220, 001180, 
000370, 000157, 001005, 001247, 000277, 000310, 000552, 001127, 000672, 001007, 
001129, 000315, 000711, 000799, 000953, 000314, 000159, 000555, 001124, 000715, 
000834, 000717, 000958, 001090, 001250, 001010, 001131, 000285, 000663, 000541, 
000661, 001238, 000143, 000264, 000425, 000667, 001078, 001232, 001113, 000422, 
001235, 000306, 000702, 000706, 001120, 001242, 000670, 000152, 001082, 000392] 
as part of SST file pruning.
...
...
2023-08-15 17:07:18,444 INFO [OM StateMachine ApplyTransaction Thread - 
0]-org.apache.hadoop.ozone.om.request.snapshot.OMSnapshotCreateRequest: Created 
snapshot: 'cm-tmp-2651cc03-a0e7-4fba-ab6c-810a863f4073' with snapshotId: 
'002cfe90-0684-4dd3-9239-487a92067fa6' under path 'vol23io/bucket870io'
...
...
2023-08-15 17:07:41,334 INFO 
[snapshot-diff-job-thread-id-2]-org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager:
 Started snap diff report generation for volume: 'vol23io', bucket: 
'bucket870io', fromSnapshot: 'cm-46-1692041157584-0', toSnapshot: 
'cm-tmp-2651cc03-a0e7-4fba-ab6c-810a863f4073'
2023-08-15 17:07:41,334 INFO 
[snapshot-diff-job-thread-id-2]-org.apache.hadoop.ozone.om.snapshot.SnapshotCache:
 Loading snapshot. Table key: /vol23io/bucket870io/cm-46-1692041157584-0
2023-08-15 17:07:41,335 INFO 
[snapshot-diff-job-thread-id-2]-org.apache.hadoop.ozone.om.helpers.OmKeyInfo: 
OmKeyInfo.getCodec ignorePipeline = true
2023-08-15 17:07:41,371 INFO 
[snapshot-diff-job-thread-id-2]-org.apache.hadoop.ozone.om.snapshot.SnapshotCache:
 Loading snapshot. Table key: 
/vol23io/bucket870io/cm-tmp-2651cc03-a0e7-4fba-ab6c-810a863f4073
2023-08-15 17:07:41,371 INFO 
[snapshot-diff-job-thread-id-2]-org.apache.hadoop.ozone.om.helpers.OmKeyInfo: 
OmKeyInfo.getCodec ignorePipeline = true
2023-08-15 17:07:41,438 WARN 
[snapshot-diff-job-thread-id-2]-org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager:
 Failed to get SST diff file using RocksDBCheckpointDiffer. It will fallback to 
full diff now.
java.io.FileNotFoundException: Can't find SST file: 001319.sst
        at 
org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getAbsoluteSstFilePath(RocksDBCheckpointDiffer.java:688)
        at 
org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.filterRelevantSstFilesFullPath(RocksDBCheckpointDiffer.java:954)
        at 
org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffList(RocksDBCheckpointDiffer.java:938)
        at 
org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffListWithFullPath(RocksDBCheckpointDiffer.java:875)
        at 
org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getDeltaFiles(SnapshotDiffManager.java:1237)
        at 
org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getDeltaFilesAndDiffKeysToObjectIdToKeyMap(SnapshotDiffManager.java:1067)
        at 
org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.lambda$3(SnapshotDiffManager.java:949)
        at 
org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.generateSnapshotDiffReport(SnapshotDiffManager.java:1015)
        at 
org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.lambda$2(SnapshotDiffManager.java:742)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
2023-08-15 17:07:41,439 WARN 
[snapshot-diff-job-thread-id-2]-org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager:
 RocksDBCheckpointDiffer is not available, falling back to slow path
2023-08-15 17:07:41,522 WARN 
[snapshot-diff-job-thread-id-2]-org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager:
 Failed to get SST diff file using RocksDBCheckpointDiffer. It will fallback to 
full diff now.
java.io.FileNotFoundException: Can't find SST file: 001319.sst
        at 
org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getAbsoluteSstFilePath(RocksDBCheckpointDiffer.java:688)
        at 
org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.filterRelevantSstFilesFullPath(RocksDBCheckpointDiffer.java:954)
        at 
org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffList(RocksDBCheckpointDiffer.java:938)
        at 
org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffListWithFullPath(RocksDBCheckpointDiffer.java:875)
        at 
org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getDeltaFiles(SnapshotDiffManager.java:1237)
        at 
org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getDeltaFilesAndDiffKeysToObjectIdToKeyMap(SnapshotDiffManager.java:1067)
        at 
org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.lambda$4(SnapshotDiffManager.java:959)
        at 
org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.generateSnapshotDiffReport(SnapshotDiffManager.java:1015)
        at 
org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.lambda$2(SnapshotDiffManager.java:742)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
2023-08-15 17:07:41,522 WARN 
[snapshot-diff-job-thread-id-2]-org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager:
 RocksDBCheckpointDiffer is not available, falling back to slow path
2023-08-15 17:07:41,531 INFO 
[snapshot-diff-job-thread-id-2]-org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager:
 Starting diff report generation for jobId: 
2239d0e4-86ff-48f9-9919-08a6b8e5ad7f.
...
{code}

Order of some events based on compaction logs:

{code}
...
0000000000000014065.log:C 14195 001317,001296:001319
...
0000000000000014414.log:S 14582 c2ae0246-626b-4a09-b1c8-0e76997c26ee 
1692041385506
...
0000000000000015641.log:C 15645 001444,001319:001447
...
0000000000000015676.log:C 15854 001468,001447:001470
...
0000000000000017844.log:C 18539 001618,001596,001470:001620
...
0000000000000035771.log:S 35841 002cfe90-0684-4dd3-9239-487a92067fa6 
1692119238440
...
0000000000000036533.log:C 36537 001950,001620:001953
...
{code}
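To make the ordering above easier to follow, here is a minimal Python sketch that parses those entries and traces 001319 forward through later compactions. It assumes, from the lines shown, that a `C` entry means "sequence number, input files : output files" and an `S` entry means "sequence number, snapshot ID, creation time in epoch milliseconds". The helper names (`parse_line`, `trace_forward`, `snapshot_time`) are hypothetical and not part of Ozone.

```python
from datetime import datetime, timedelta, timezone

# Entries copied from the compaction log excerpt above. The "..." gaps are
# omitted and each wrapped S entry is re-joined onto one line.
LOG_LINES = [
    "C 14195 001317,001296:001319",
    "S 14582 c2ae0246-626b-4a09-b1c8-0e76997c26ee 1692041385506",
    "C 15645 001444,001319:001447",
    "C 15854 001468,001447:001470",
    "C 18539 001618,001596,001470:001620",
    "S 35841 002cfe90-0684-4dd3-9239-487a92067fa6 1692119238440",
    "C 36537 001950,001620:001953",
]


def parse_line(line):
    """Parse one entry. The field meanings are an assumption (see above)."""
    kind, seq, rest = line.split(" ", 2)
    if kind == "C":
        inputs, outputs = rest.split(":")
        return {"kind": "C", "seq": int(seq),
                "inputs": inputs.split(","), "outputs": outputs.split(",")}
    if kind == "S":
        snapshot_id, created_ms = rest.split()
        return {"kind": "S", "seq": int(seq),
                "snapshot_id": snapshot_id, "created_ms": int(created_ms)}
    raise ValueError(f"unrecognized entry: {line!r}")


def trace_forward(entries, sst_file):
    """Follow sst_file through every later compaction that consumed it."""
    chain = [sst_file]
    frontier = {sst_file}
    for entry in entries:
        if entry["kind"] == "C" and frontier & set(entry["inputs"]):
            for out in entry["outputs"]:
                if out not in frontier:
                    frontier.add(out)
                    chain.append(out)
    return chain


def snapshot_time(entry):
    """Convert an S entry's epoch-millisecond field to a UTC datetime."""
    ms = entry["created_ms"]
    base = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
    return base + timedelta(milliseconds=ms % 1000)


entries = [parse_line(line) for line in LOG_LINES]
print(trace_forward(entries, "001319"))
for entry in entries:
    if entry["kind"] == "S":
        print(entry["snapshot_id"], snapshot_time(entry).isoformat())
```

Under these assumptions, 001319 is created before the first snapshot entry and is consumed by the compaction at sequence 15645, which falls between the two snapshot entries. Read as UTC, the two `S` timestamps line up with the "Created snapshot" log lines above to the second.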

Files in the from-snapshot:
{code}
[root@quasar-dbrrnu-1 checkpointState]# ls 
om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee
000065.sst  000128.sst  000192.sst  000234.sst  000289.sst  000347.sst  
000378.sst  000481.sst  000577.sst  000666.sst  000775.sst  000883.sst  
000968.sst  001020.sst  001047.sst  001093.sst  001141.sst  001209.sst  
001247.sst  001276.sst  001298.sst  001323.sst  001336.sst  001345.sst  
001353.sst  001358.sst  CURRENT                   LOG.old.1692119261337458
000076.sst  000140.sst  000206.sst  000244.sst  000292.sst  000353.sst  
000406.sst  000516.sst  000588.sst  000674.sst  000779.sst  000886.sst  
000977.sst  001033.sst  001049.sst  001103.sst  001156.sst  001212.sst  
001250.sst  001277.sst  001307.sst  001325.sst  001338.sst  001346.sst  
001354.sst  001361.log  IDENTITY                  MANIFEST-001372
000078.sst  000153.sst  000209.sst  000250.sst  000307.sst  000367.sst  
000443.sst  000525.sst  000597.sst  000699.sst  000811.sst  000894.sst  
000991.sst  001036.sst  001053.sst  001127.sst  001158.sst  001215.sst  
001251.sst  001287.sst  001312.sst  001328.sst  001339.sst  001349.sst  
001355.sst  001367.log  LOCK                      OPTIONS-001370
000093.sst  000166.sst  000212.sst  000284.sst  000323.sst  000370.sst  
000446.sst  000552.sst  000628.sst  000736.sst  000821.sst  000913.sst  
001005.sst  001038.sst  001058.sst  001129.sst  001168.sst  001222.sst  
001254.sst  001289.sst  001315.sst  001331.sst  001340.sst  001351.sst  
001356.sst  001371.log  LOG                       OPTIONS-001374
000102.sst  000172.sst  000216.sst  000285.sst  000330.sst  000371.sst  
000452.sst  000556.sst  000663.sst  000743.sst  000846.sst  000929.sst  
001009.sst  001039.sst  001090.sst  001130.sst  001173.sst  001230.sst  
001274.sst  001292.sst  001320.sst  001333.sst  001343.sst  001352.sst  
001357.sst  archive     LOG.old.1692117133816520
{code}

The to-snapshot no longer exists because it was deleted after the snapshot diff job.

Files deleted by the SST filtering service from the from-snapshot:
{code}

2023-08-14 19:30:25,363 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.RocksDatabase: Deleting 
sst file /001319.sst corresponding to column family keyTable from db: 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee
2023-08-14 19:30:25,365 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.managed.ManagedRocksObjectUtils:
 Waited for 0 milliseconds for file 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee/001319.sst
 deletion.
2023-08-14 19:30:25,366 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.RocksDatabase: Deleting 
sst file /000669.sst corresponding to column family keyTable from db: 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee
2023-08-14 19:30:25,367 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.managed.ManagedRocksObjectUtils:
 Waited for 0 milliseconds for file 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee/000669.sst
 deletion.
2023-08-14 19:30:25,369 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.RocksDatabase: Deleting 
sst file /001015.sst corresponding to column family keyTable from db: 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee
2023-08-14 19:30:25,370 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.managed.ManagedRocksObjectUtils:
 Waited for 0 milliseconds for file 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee/001015.sst
 deletion.
2023-08-14 19:30:25,370 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.RocksDatabase: Deleting 
sst file /000781.sst corresponding to column family keyTable from db: 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee
2023-08-14 19:30:25,372 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.managed.ManagedRocksObjectUtils:
 Waited for 0 milliseconds for file 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee/000781.sst
 deletion.
2023-08-14 19:30:25,373 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.RocksDatabase: Deleting 
sst file /000558.sst corresponding to column family keyTable from db: 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee
2023-08-14 19:30:25,374 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.managed.ManagedRocksObjectUtils:
 Waited for 0 milliseconds for file 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee/000558.sst
 deletion.
2023-08-14 19:30:25,377 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.RocksDatabase: Deleting 
sst file /001013.sst corresponding to column family keyTable from db: 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee
2023-08-14 19:30:25,378 INFO 
[SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.managed.ManagedRocksObjectUtils:
 Waited for 0 milliseconds for file 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee/001013.sst
 deletion.
{code}
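The deletions above can be pulled out of the OM log mechanically. The following is a rough Python sketch that extracts the SST files removed by `SstFilteringService` per snapshot DB and intersects them with files reported by the pruning message. The regular expression is written against the message text shown in this comment only. `PRUNED_EXCERPT` is a hand-picked excerpt of the "Removing SST files" list, not the full list, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Three of the "Deleting sst file" entries from the log above, each re-joined
# onto one line (the email wrapped them).
LOG_TEXT = """\
2023-08-14 19:30:25,363 INFO [SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.RocksDatabase: Deleting sst file /001319.sst corresponding to column family keyTable from db: /var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee
2023-08-14 19:30:25,366 INFO [SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.RocksDatabase: Deleting sst file /000669.sst corresponding to column family keyTable from db: /var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee
2023-08-14 19:30:25,369 INFO [SstFilteringService#0]-org.apache.hadoop.hdds.utils.db.RocksDatabase: Deleting sst file /001015.sst corresponding to column family keyTable from db: /var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-c2ae0246-626b-4a09-b1c8-0e76997c26ee
"""

# \s+ between words tolerates hard-wrapped log lines as well as joined ones.
DELETE_RE = re.compile(
    r"Deleting\s+sst\s+file\s+/(\d+)\.sst\s+corresponding\s+to\s+column\s+"
    r"family\s+(\S+)\s+from\s+db:\s+(\S+)"
)

# Excerpt (not the full list) of the files named in the
# CompactionDagPruningService "Removing SST files" message.
PRUNED_EXCERPT = {"001319", "001015", "000628"}


def deleted_by_snapshot(log_text):
    """Map each snapshot DB path to the SST file numbers deleted from it."""
    deleted = defaultdict(list)
    for file_no, _column_family, db_path in DELETE_RE.findall(log_text):
        deleted[db_path].append(file_no)
    return dict(deleted)


for db_path, files in deleted_by_snapshot(LOG_TEXT).items():
    in_both = sorted(set(files) & PRUNED_EXCERPT)
    print(db_path)
    print("  deleted by SstFilteringService:", files)
    print("  also in pruning excerpt:", in_both)
```

With this sample, 001319 appears in both sets: it was deleted from the from-snapshot at 19:30:25 and listed by the pruning service at 19:43:19.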


Nothing was deleted by the SST filtering service for the to-snapshot:
{code}
[root@quasar-dbrrnu-1 compaction-log]# grep 
002cfe90-0684-4dd3-9239-487a92067fa6 /var/log/hadoop-ozone/ozone-om.log
2023-08-15 17:07:18,444 INFO [OM StateMachine ApplyTransaction Thread - 
0]-org.apache.hadoop.ozone.om.request.snapshot.OMSnapshotCreateRequest: Created 
snapshot: 'cm-tmp-2651cc03-a0e7-4fba-ab6c-810a863f4073' with snapshotId: 
'002cfe90-0684-4dd3-9239-487a92067fa6' under path 'vol23io/bucket870io'
2023-08-15 17:07:18,463 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager:
 Created checkpoint in rocksDB at 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-002cfe90-0684-4dd3-9239-487a92067fa6
 in 18 milliseconds
2023-08-15 17:07:18,465 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils: 
Waited for 0 milliseconds for checkpoint directory 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-002cfe90-0684-4dd3-9239-487a92067fa6
 availability.
2023-08-15 17:07:18,466 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
Created checkpoint : 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-002cfe90-0684-4dd3-9239-487a92067fa6
 for snapshot cm-tmp-2651cc03-a0e7-4fba-ab6c-810a863f4073
2023-08-16 00:24:55,619 DEBUG 
[main]-org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Processing line: S 
35841 002cfe90-0684-4dd3-9239-487a92067fa6 1692119238440
{code}



> SST files are missing on optimized snapDiff path.
> -------------------------------------------------
>
>                 Key: HDDS-8940
>                 URL: https://issues.apache.org/jira/browse/HDDS-8940
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Hemant Kumar
>            Assignee: Hemant Kumar
>            Priority: Major
>         Attachments: HDDS-8940_Compaction_Dag.png, 
> HDDS-8940_Compaction_Dag_1.png
>
>
> While running snapDiff, we are seeing SST files missing on optimized snapDiff 
> path.
> {code}
> 2023-06-23 19:59:16,323 [snapshot-diff-job-thread-id-14] ERROR 
> org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager: Caught checked 
> exception during diff report generation for volume: volume1 bucket: bucket1, 
> fromSnapshot: alma2 and toSnapshot: 
> cm-tmp-0ae3d532-237d-4df2-83f9-4844d153521e
> java.io.FileNotFoundException: Can't find SST file: 010788.sst
> at 
> org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getAbsoluteSstFilePath(RocksDBCheckpointDiffer.java:654)
> at 
> org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.filterRelevantSstFilesFullPath(RocksDBCheckpointDiffer.java:949)
> at 
> org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffList(RocksDBCheckpointDiffer.java:933)
> at 
> org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffListWithFullPath(RocksDBCheckpointDiffer.java:868)
> at 
> org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getDeltaFiles(SnapshotDiffManager.java:929)
> at 
> org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getDeltaFilesAndDiffKeysToObjectIdToKeyMap(SnapshotDiffManager.java:793)
> at 
> org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.generateSnapshotDiffReport(SnapshotDiffManager.java:721)
> at 
> org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.lambda$0(SnapshotDiffManager.java:565)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
