Hi,

The 2nd case is the one where version GC is needed; it deletes the older revisions from the node store.
The 3rd case is a bit of a problem. DSGC takes a paranoid approach: if it finds no blob references at all, it fails with an exception rather than wiping out the whole datastore (no references at all could mean the NodeStore reference collection itself had a problem). I believe that's the right approach, though. For your case, the way you can test is to have some references which aren't deleted, so that the process will proceed. But if you are not just testing and know that such situations will be present in your application, then rather than running DSGC you can simply list all the blob ids from the DataStore and delete them directly.

Will that work for you?

Thanks
Amit

On Fri, Jul 19, 2019 at 1:13 PM Ruben Lozano <[email protected]> wrote:

> Hi,
>
> I'm still trying to delete that blob when the nodes that reference it
> are deleted.
>
> While searching for a way to delete the last reference to the blob, I
> ended up disabling node versioning, as I couldn't delete the node's
> version history.
>
> I have three cases right now:
>
> - If I delete the node with versioning, run the version GC and the
> blob GC, I end up with a frozenNode reference (protected) that
> prevents the blob from being physically deleted by the blob GC.
>
> - If I delete the node without versioning, skip the version GC and
> only run the blob GC, I end up with a reference from a logically
> deleted node, but the blob GC doesn't delete the blob.
>
> - If I create the file without versioning, run the version GC and the
> blob GC, I end up with an IOException "Marked references not
> available", because the blob is not referenced anywhere.
>
> I thought that blob GC should delete unreferenced blobs, not throw an
> exception. I'm stuck, because I don't know in which case the GC is
> going to delete the blob.
>
> Thanks for your help,
>
> Ruben Lozano
>
>
> On 15/07/2019 at 14:57, Ruben Lozano wrote:
> > Hi again,
> >
> > First of all, thanks for your answer.
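For the direct-deletion route, here is a rough sketch of what I mean (untested; it assumes you can get at the repository's GarbageCollectableBlobStore, and that passing 0 as the max-last-modified time means "no time filter" — double-check that against the javadoc before running this against real data):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.apache.jackrabbit.oak.spi.blob.GarbageCollectableBlobStore;

public class DirectBlobWipe {

    /**
     * Lists every chunk id in the data store and deletes them directly,
     * bypassing DSGC entirely. Only safe when you KNOW that nothing in
     * the repository references these blobs any more.
     */
    static long wipeAllBlobs(GarbageCollectableBlobStore store) throws Exception {
        List<String> batch = new ArrayList<>();
        long deleted = 0;
        // 0 as maxLastModifiedTime should list everything (assumption).
        Iterator<String> ids = store.getAllChunkIds(0);
        while (ids.hasNext()) {
            batch.add(ids.next());
            if (batch.size() == 100) { // delete in batches of 100
                deleted += store.countDeleteChunks(batch, 0);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            deleted += store.countDeleteChunks(batch, 0);
        }
        return deleted;
    }
}
```

Obviously this is a blunt instrument, so keep it behind a very deliberate code path.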
> > After the node deletion, I ran the VersionGarbageCollector before
> > the MarkSweepGarbageCollector as you said, but the blob is still
> > referenced and therefore is not deleted.
> >
> > ns.getClock().waitUntil(ns.getClock().getTime() + 1000);
> > vGC.gc(0, TimeUnit.MILLISECONDS);
> >
> > The node, and the child nodes that I added, are deleted properly by
> > the version garbage collector, but if I use checkConsistency:
> >
> > Number of valid blob references marked under mark phase of Blob
> > garbage collection [2]
> >
> > Blob garbage collection completed in 22.61 ms (22 ms). Number of
> > blobs deleted [0] with max modification time of [2019-07-15
> > 14:33:06.117]
> >
> > Probably I'm missing something, but if the file nodes have been
> > deleted, shouldn't the blob references be completely deleted as
> > well?
> >
> > Thanks for your time,
> >
> > Ruben Lozano
> >
> >
> > On 11/07/2019 at 12:36, Amit Jain wrote:
> >> Hi,
> >>
> >> You need to run version GC before doing data store garbage
> >> collection (DSGC); it is a prerequisite for it.
> >>
> >> You need to call VersionGarbageCollector#gc to delete older node
> >> revisions for DSGC to be effective. Do take a look at the test case
> >> [1] which sets up the deleted nodes to be version collected before
> >> running DSGC. The version garbage collector uses a max-age
> >> parameter, and that age must have elapsed before it will collect
> >> the corresponding nodes.
> >>
> >> Also, there's a max-age parameter for deleting only aged blobs,
> >> which you have set to 1 ms, so that should be ok.
> >>
> >> Thanks
> >> Amit
> >>
> >> [1]
> >> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/MongoBlobGCTest.java#L152
> >>
> >>
> >> On Thu, Jul 11, 2019 at 3:33 PM Ruben Lozano
> >> <[email protected]> wrote:
> >>
> >>> Hi, greetings from Spain.
> >>>
> >>> I have been working with Oak for a month in a Spring Boot
> >>> application, using the Oak API to create the content repository
> >>> service.
> >>>
> >>> I can upload and download large files, but I have a problem with
> >>> the file delete service.
> >>>
> >>> After invoking the node.remove and session.save operations, if I
> >>> try to get the file, the node has been deleted properly, but in
> >>> the blobs collection the file's space remains occupied.
> >>>
> >>> In order to free the deleted blob I have tried to use the
> >>> VersionGarbageCollector and the MarkSweepGarbageCollector, but
> >>> neither of those worked.
> >>>
> >>> The way I've been calling the MarkSweepGarbageCollector is:
> >>>
> >>> MarkSweepGarbageCollector gc = new MarkSweepGarbageCollector(new
> >>> DocumentBlobReferenceRetriever(documentNodeStore),
> >>> (GarbageCollectableBlobStore) documentNodeStore.getBlobStore(),
> >>> (ThreadPoolExecutor) Executors.newFixedThreadPool(1), ADMIN, 5,
> >>> 1, "mongodb://" + "localhost" + ":" + PORT);
> >>>
> >>> gc.collectGarbage(false);
> >>>
> >>> The collector can find the proper blobs, but they are not deleted:
> >>>
> >>> Collected (115) blob references
> >>>
> >>> Number of valid blob references marked under mark phase of Blob
> >>> garbage collection [138]
> >>>
> >>> Number of blobs present in BlobStore : [23]
> >>>
> >>> Blob garbage collection completed in 56.33 ms (56 ms).
> >>> Number of blobs deleted [0] with max modification time of
> >>> [2019-07-11 10:22:18.875]
> >>>
> >>> I'm sure I'm doing something wrong; maybe I need to create a new
> >>> session or mark the blob for deletion somehow.
> >>>
> >>> Thanks for your help.
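Putting the advice in this thread together, the overall sequence would look roughly like this (a sketch only, reusing the constructor arguments from the original mail; `ns` and `repoId` are placeholders for your DocumentNodeStore and repository id, and it assumes the version-GC max age and node-deletion timestamps have already passed):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector;
import org.apache.jackrabbit.oak.plugins.document.DocumentBlobReferenceRetriever;
import org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore;
import org.apache.jackrabbit.oak.spi.blob.GarbageCollectableBlobStore;

public class GcSequence {

    static void collect(DocumentNodeStore ns, String root, String repoId)
            throws Exception {
        // Step 1: version GC removes the deleted nodes' old revisions,
        // so the blob references held by those revisions go away.
        ns.getVersionGarbageCollector().gc(0, TimeUnit.MILLISECONDS);

        // Step 2: blob GC now marks only the live references and
        // sweeps everything else in the blob store.
        MarkSweepGarbageCollector gc = new MarkSweepGarbageCollector(
                new DocumentBlobReferenceRetriever(ns),
                (GarbageCollectableBlobStore) ns.getBlobStore(),
                (ThreadPoolExecutor) Executors.newFixedThreadPool(1),
                root, 5, /* maxBlobGcAgeMs */ 1, repoId);
        gc.collectGarbage(false);
    }
}
```

The key point is the ordering: until the version GC has actually purged the deleted revisions, the mark phase will still count their blob references as valid, which matches the "deleted [0]" output quoted above.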
