Wim Symons created OAK-8170:
-------------------------------

             Summary: oak-run datastorecheck and online consistency check falsely report missing blobs
                 Key: OAK-8170
                 URL: https://issues.apache.org/jira/browse/OAK-8170
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: segment-tar
    Affects Versions: 1.8.9
            Reporter: Wim Symons
         Attachments: output.txt

Hi,

We found that oak-run datastorecheck falsely reports missing blobs when it is run without the --verbose option. The online datastore consistency check falsely reports the same missing blobs.

This happens because the standard blob reference collector in oak-run datastorecheck looks at *all* compaction generations in the segment store instead of only the last one. After running an offline compaction, which leaves only one generation, oak-run datastorecheck reports the correct number of blob references and missing blobs.

On the 1.8 branch the bug starts at org.apache.jackrabbit.oak.plugins.blob.BlobReferenceRetriever#collectReferences (line 429). Following that call, you arrive at org.apache.jackrabbit.oak.segment.file.FileStore#tarFiles (line 1013), which reads:

    tarFiles.collectBlobReferences(collector, newOldReclaimer(lastCompactionType, getGcGeneration(), gcOptions.getRetainedGenerations()));

I'm not familiar enough with this source code, so I won't attempt a patch. I did double-check trunk and saw the same line of code there: org.apache.jackrabbit.oak.segment.file.GarbageCollector#collectBlobReferences (line 324).

I attached a text file with the outputs of the commands I ran.

We currently run Oak 1.8.9 with AEM 6.4.3.0, and oak-blob-cloud 1.8.9 from the 1.8.3 AEM S3 connector.

Regards,
Wim

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)