Hi,

I'm still trying to delete that blob when the nodes that reference it are deleted.

While looking for a way to delete the last reference to the blob, I ended up disabling node versioning, because I couldn't delete the node's version history.


I have three cases right now:

- If I delete the node with versioning enabled and run the version GC and then the blob GC, I end up with a (protected) frozenNode reference that prevents the blob GC from physically deleting the blob.

- If I delete the node without versioning, skip the version GC and run only the blob GC, I end up with a reference from a logically deleted node, and the blob GC doesn't delete the blob.

- If I create the file without versioning and run the version GC and then the blob GC, I end up with an IOException "Marked references not available", because the blob is not referenced anywhere.


I thought the blob GC should delete unreferenced blobs, not throw an exception. I'm stuck, because I don't know in which case the GC will actually delete the blob.


Thanks for your help,

Ruben Lozano


On 15/07/2019 at 14:57, Ruben Lozano wrote:
Hi again,

First of all, thanks for your answer.


After deleting the node, I ran the VersionGarbageCollector before the MarkSweepGarbageCollector as you suggested, but the blob is still referenced and therefore is not deleted.

ns.getClock().waitUntil(ns.getClock().getTime() + 1000);
vGC.gc(0, TimeUnit.MILLISECONDS);

The node and the child nodes I added are deleted properly by the version garbage collector, but if I use checkConsistency:


Number of valid blob references marked under mark phase of Blob garbage collection [2]

Blob garbage collection completed in 22.61 ms (22 ms). Number of blobs deleted [0] with max modification time of [2019-07-15 14:33:06.117]


I'm probably missing something, but if the file nodes are deleted, shouldn't the blob references be deleted completely as well?

Thanks for your time,

Ruben Lozano


On 11/07/2019 at 12:36, Amit Jain wrote:
Hi,

Running version GC is a prerequisite for data store garbage collection
(dsgc), so you need to run it first.

You would need to call VersionGarbageCollector#gc to delete older node
revisions for dsgc to be effective. Do take a look at the test case [1], which
sets up the deleted nodes to be version collected before running dsgc. The
version garbage collector uses a max age parameter, which must have elapsed
before it will collect the corresponding nodes.

Also, there's a max age parameter for deleting only aged blobs which you
have set to 1ms so that should be ok.
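
To make the two conditions above concrete, here is a toy model in plain Java. It is only a sketch and not Oak's implementation: the class ToyBlobGc, the method sweep and the blob ids are hypothetical names invented for illustration. It assumes a blob is swept only when no remaining revision references it and its last-modified time is older than the max age.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: hypothetical names, not part of the Oak API.
final class ToyBlobGc {

    // Returns the ids of the blobs that a mark-and-sweep pass would delete:
    // blobs that are NOT in the marked (still referenced) set AND whose
    // last-modified time is at least maxAgeMillis in the past.
    static Set<String> sweep(Map<String, Long> blobLastModified,
                             Set<String> markedReferences,
                             long now, long maxAgeMillis) {
        Set<String> deleted = new TreeSet<>();
        for (Map.Entry<String, Long> e : blobLastModified.entrySet()) {
            boolean referenced = markedReferences.contains(e.getKey());
            boolean oldEnough = e.getValue() <= now - maxAgeMillis;
            if (!referenced && oldEnough) {
                deleted.add(e.getKey());
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        // b1: blob of a removed node, b2: blob of a live node,
        // b3: recent blob that nothing references yet.
        Map<String, Long> blobs = Map.of("b1", 1_000L, "b2", 1_000L, "b3", 9_990L);
        long now = 10_000L;
        long maxAge = 100L;

        // Before version GC: the old revision of the removed node still
        // references b1, so the mark phase keeps it. b3 is too young.
        System.out.println(sweep(blobs, Set.of("b1", "b2"), now, maxAge)); // prints []

        // After version GC: only the live node's reference is left.
        System.out.println(sweep(blobs, Set.of("b2"), now, maxAge)); // prints [b1]
    }
}
```

In this model the first pass deletes nothing, which is consistent with a "Number of blobs deleted [0]" result while old revisions still hold the reference.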

Thanks
Amit

[1]
https://github.com/apache/jackrabbit-oak/blob/trunk/oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/MongoBlobGCTest.java#L152

On Thu, Jul 11, 2019 at 3:33 PM Ruben Lozano<[email protected]>
wrote:

Hi, greetings from Spain

I have been working with Oak for a month in a Spring Boot application,
using the Oak API to create the content repository service.

I can upload and download large files, but I have a problem with the
file delete services.

After invoking the node.remove and session.save operations, if I
try to get the file, the node has been deleted properly, but in the
blobs collection the file's space remains occupied.

To free the space of the deleted blob I have tried to use the
VersionGarbageCollector and the MarkSweepGarbageCollector, but neither
of them worked.

The way I've been calling the MarkSweepGarbageCollector is:

MarkSweepGarbageCollector gc = new MarkSweepGarbageCollector(
        new DocumentBlobReferenceRetriever(documentNodeStore),
        (GarbageCollectableBlobStore) documentNodeStore.getBlobStore(),
        (ThreadPoolExecutor) Executors.newFixedThreadPool(1),
        ADMIN, 5, 1,
        "mongodb://" + "localhost" + ":" + PORT);

gc.collectGarbage(false);


The collector can find the proper blobs but they're not being deleted:

Collected (115) blob references

Number of valid blob references marked under mark phase of Blob garbage
collection [138]

Number of blobs present in BlobStore : [23]

Blob garbage collection completed in 56.33 ms (56 ms). Number of blobs
deleted [0] with max modification time of [2019-07-11 10:22:18.875]


I'm sure I'm doing something wrong, maybe I need to create a new session
or mark the blob for deletion somehow.


Thanks for your help.




