Thomas Mueller created OAK-8991:
-----------------------------------

             Summary: MarkSweepGarbageCollector: repeated warnings for files that don't exist
                 Key: OAK-8991
                 URL: https://issues.apache.org/jira/browse/OAK-8991
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: blob
            Reporter: Thomas Mueller


When using the MarkSweepGarbageCollector (for example with a file data store),
if the blob id file (from the BlobIdTracker) contains records that don't exist
in the datastore, a warning is logged when the collector tries to remove the
(unreferenced) file:

 
{noformat}
*WARN* org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector Error occurred while deleting blob with id [...]
org.apache.jackrabbit.core.data.DataStoreException: Record ... does not exist
        at org.apache.jackrabbit.core.data.AbstractDataStore.getRecord(AbstractDataStore.java:59) [org.apache.jackrabbit.jackrabbit-data:2.16.3]
        at org.apache.jackrabbit.oak.plugins.blob.datastore.OakFileDataStore.getRecordForId(OakFileDataStore.java:259) [org.apache.jackrabbit.oak-blob-plugins:1.8.9]
        at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getRecordForId(DataStoreBlobStore.java:520) [org.apache.jackrabbit.oak-blob-plugins:1.8.9]
        at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.countDeleteChunks(DataStoreBlobStore.java:426) [org.apache.jackrabbit.oak-blob-plugins:1.8.9]
        at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector$BlobCollectionType.sweepInternal(MarkSweepGarbageCollector.java:859) [org.apache.jackrabbit.oak-blob-plugins:1.8.9]
        at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.sweep(MarkSweepGarbageCollector.java:423) [org.apache.jackrabbit.oak-blob-plugins:1.8.9]
        at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.markAndSweep(MarkSweepGarbageCollector.java:287) [org.apache.jackrabbit.oak-blob-plugins:1.8.9]
        at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.collectGarbage(MarkSweepGarbageCollector.java:194) [org.apache.jackrabbit.oak-blob-plugins:1.8.9]
{noformat}

That means the collector tried to remove a file that doesn't exist.
This indicates a problem in the process; for example, the blob id tracker
file(s) may have been restored from an older backup. (There may be other ways
this can occur.)

The next time garbage collection runs, it tries to remove the same files
again, and fails again.

It would be better if the ids of files that don't exist were removed from the
blob id tracker file, so that the collector does not retry them on every run.
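
A minimal sketch of that idea, not the actual Oak code: the Store interface,
the sweep method and the in-memory id sets are hypothetical stand-ins for the
datastore and the blob id tracker file. The point is only that an id found to
be missing gets dropped from the tracked ids instead of being left in place.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class SweepSketch {

    /** Hypothetical stand-in for the datastore; not the Oak API. */
    interface Store {
        boolean exists(String id);
        void delete(String id);
    }

    /**
     * Deletes each unreferenced candidate from the store and removes it
     * from the tracked ids. Ids that have no record in the store are
     * removed from the tracked ids as well and returned, so the caller
     * can log them once instead of hitting them again on every run.
     * The candidates set must not be a view of trackedIds.
     */
    static List<String> sweep(Set<String> trackedIds,
                              Set<String> candidates,
                              Store store) {
        List<String> missing = new ArrayList<>();
        for (String id : candidates) {
            if (store.exists(id)) {
                store.delete(id);
            } else {
                missing.add(id);
            }
            // Deleted or never present: the id no longer belongs in the tracker.
            trackedIds.remove(id);
        }
        return missing;
    }
}
```

With this shape, a second run over the same tracked ids no longer sees the
missing ids at all, so the warning would be logged at most once per id.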

If the blob id tracker file(s) are incorrect, I think it would be better to
delete and rebuild them; otherwise some of the unreferenced binaries will never
be removed. Possibly a warning should be logged, with instructions on how to
rebuild these files.
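
As a rough sketch of the rebuild, assuming the tracked ids can be regenerated
from a full listing of the ids in the datastore (that assumption, and every
name below, is hypothetical and not taken from Oak):

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.logging.Logger;

public class TrackerRebuildSketch {

    private static final Logger LOG =
            Logger.getLogger(TrackerRebuildSketch.class.getName());

    /**
     * Compares the tracked ids with the ids actually present in the store,
     * logs a warning if they differ, and returns a fresh set built only
     * from the store listing.
     */
    static Set<String> rebuild(Set<String> trackedIds, Set<String> idsInStore) {
        // Tracked, but no record in the store: these cause the repeated warnings.
        Set<String> stale = new TreeSet<>(trackedIds);
        stale.removeAll(idsInStore);

        // In the store, but not tracked: these would never be considered for removal.
        Set<String> untracked = new TreeSet<>(idsInStore);
        untracked.removeAll(trackedIds);

        if (!stale.isEmpty() || !untracked.isEmpty()) {
            LOG.warning("Blob id tracker is out of sync with the datastore: "
                    + stale.size() + " tracked ids have no record, "
                    + untracked.size() + " records are not tracked; rebuilding");
        }
        return new TreeSet<>(idsInStore);
    }
}
```

Listing the whole datastore can be expensive, so a rebuild like this would
presumably be triggered only when missing records are detected, not on every
run.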



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
