[ 
https://issues.apache.org/jira/browse/OAK-9765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534722#comment-17534722
 ] 

Piercarlo Slavazza edited comment on OAK-9765 at 5/11/22 7:09 AM:
------------------------------------------------------------------

[~amitj] sorry, my bad, I added the compaction in the wrong place in the code - 
fixed.

Actually compaction did the trick, despite the fact that I still add just one 
10MB blob.

However, the GC fails anyway because of this condition:

[https://github.com/apache/jackrabbit-oak/blob/5ac5b177cba9f4e40641d69fed986cd420b6a17b/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/MarkSweepGarbageCollector.java#L933]

That is:
{code:java}
if (!fs.getMarkedRefs().exists() || fs.getMarkedRefs().length() == 0) {
{code}
When the GC is called at the end of the program, no referenced blobs are present, 
so {{fs.getMarkedRefs().length()}} is zero and therefore an exception is 
raised:
{code:java}
Exception in thread "main" java.io.IOException: Marked references not available
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector$GarbageCollectionType.mergeAllMarkedReferences(MarkSweepGarbageCollector.java:934)
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.sweep(MarkSweepGarbageCollector.java:477)
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.markAndSweep(MarkSweepGarbageCollector.java:368)
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.collectGarbage(MarkSweepGarbageCollector.java:245)
    at com.example.TestOakGarbageCollection.main(TestOakGarbageCollection.java:307)
{code}
[~amitj] do you have an idea of why that condition is there? To me, it seems 
wrong: even if there are no more blobs referenced, the GC should succeed.
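
To make the question concrete, here is a minimal standalone sketch that uses only {{java.io.File}} and no Oak classes. {{strictCheck}} mirrors the quoted condition; {{lenientCheck}} is a hypothetical alternative that rejects only a missing file. The class and method names are made up for illustration and are not part of Oak, and whether relaxing the check would be safe inside the collector is exactly what I am asking.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class MarkedRefsCheck {

    // Mirrors the quoted condition: a missing file and an empty file
    // are both treated as "marked references not available".
    static void strictCheck(File markedRefs) throws IOException {
        if (!markedRefs.exists() || markedRefs.length() == 0) {
            throw new IOException("Marked references not available");
        }
    }

    // Hypothetical alternative (NOT Oak code): only a missing file is an
    // error; an existing but empty file means "nothing is referenced".
    static void lenientCheck(File markedRefs) throws IOException {
        if (!markedRefs.exists()) {
            throw new IOException("Marked references not available");
        }
    }

    public static void main(String[] args) throws IOException {
        // Files.createTempFile creates an empty file, which stands in for
        // a marked-references file written when no blob is referenced.
        File empty = Files.createTempFile("marked-refs", ".txt").toFile();
        empty.deleteOnExit();

        try {
            strictCheck(empty);
            System.out.println("strict check: passed");
        } catch (IOException e) {
            System.out.println("strict check: " + e.getMessage());
        }

        lenientCheck(empty);
        System.out.println("lenient check: passed");
    }
}
```

With an empty file the strict variant throws and the lenient one does not, which matches what I observe when the only blob has been dereferenced before the GC runs.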


> Garbage Collection does not remove blobs file from the file system
> ------------------------------------------------------------------
>
>                 Key: OAK-9765
>                 URL: https://issues.apache.org/jira/browse/OAK-9765
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>    Affects Versions: 1.42.0
>            Reporter: Piercarlo Slavazza
>            Priority: Blocker
>
> Using a NodeStore backed by a FileStore, with a blob store of type 
> FileBlobStore:
>  # (having configured GC with estimation _disabled_)
>  # a file is added as a blob
>  # then the node where the blob is referenced is _removed_
>  # then the GC is run
>  # expected behaviour: the node is no longer accessible, _and_ no chunk of the 
> blob is present on the file system
>  # actual behaviour: the node is no longer accessible BUT all the chunks are 
> still present on the file system
> Steps to reproduce: execute the (really tiny) main in 
> [https://github.com/PiercarloSlavazza/oak-garbage-collection-test/] 
> (instructions in the readme)



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
