Matt Ryan commented on OAK-7083:

Thanks [~amjain] for your comments and review so far.

Since there are a lot of questions, I'm going to try to distill them down to what 
I think are the key issues and then work through the dependent issues as they come.

Let's consider first the proposals to handle garbage collection for composite 
data stores.  I think there are three currently.  For reference, my original 
proposal is:  Change the MarkSweepGarbageCollector so we don't remove any 
"references" files from the metadata area until all repositories connected to a 
data store have attempted the sweep phase.  I think the three proposals are:
 # Move forward with the change I proposed.
 # Require that every repository complete the "mark" phase before any 
repository can attempt a "sweep" phase.
 # Use my proposal but only for repositories using CompositeDataStore.

h3. Proposal 1

I believe the concern with proposal 1 is that production repositories sharing 
the same data store may run GC on completely different schedules.  We can't be 
sure that all repositories complete a mark phase before any repository attempts 
a sweep phase.  In the context of my proposal, I believe this means that blobs 
that should be deleted may take longer to delete than expected - for example, it 
may take a couple of invocations.

In the normal shared data store use case, I think the impact is that all of the 
connected repositories will try to run the sweep phase.  The same blobs will be 
deleted by the first sweeper as would have been deleted before.  It doesn't 
impact the ability to collect garbage, but it may reduce efficiency or produce 
confusing log messages (which might be fixable).

In the composite data store use case, since either repository may be able to 
delete blobs that the other repository cannot, collection may take multiple 
cycles.  For example, assume a production and a staging system.  If the staging 
system deletes a node with a blob reference and then runs mark and sweep, the 
sweep may fail because the production repository hasn't done its mark phase yet 
(no "references" file from the production repo).  Later, the production system 
would mark and then sweep, deleting its blobs but unable to delete blobs on the 
staging side.  However, with my change the "references" files remain, so the 
next time the staging system runs mark and sweep it will be able to sweep, 
since all the "references" files are still there, and it will then delete the 
blob that became unreferenced earlier.

So eventually I think blobs that should be collected will end up collected, 
although it may take a while.
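To make the retention idea concrete, here is a minimal, self-contained sketch of the rule I'm proposing (all names are illustrative; this is not the actual MarkSweepGarbageCollector code): a repository may only sweep once every repository has published a "references" file, and those files are retained until every repository has attempted a sweep.

```java
import java.util.*;

// Hypothetical model of the proposed GC coordination, not the Oak API.
class SharedGcSketch {
    final Set<String> repositories = new HashSet<>();
    // repoId -> blob ids listed in that repository's "references" file
    final Map<String, Set<String>> referencesFiles = new HashMap<>();
    final Set<String> sweptRepos = new HashSet<>();

    // Mark phase: a repository publishes the blob ids it still references.
    void mark(String repoId, Set<String> referencedBlobs) {
        referencesFiles.put(repoId, new HashSet<>(referencedBlobs));
    }

    // Sweep phase: returns the blob ids this repository may delete,
    // or an empty set if the sweep must be cancelled for now.
    Set<String> sweep(String repoId, Set<String> candidateBlobs) {
        // A repository can only sweep once every repository has marked.
        if (!referencesFiles.keySet().containsAll(repositories)) {
            return Collections.emptySet();
        }
        // A blob is deletable only if no repository references it.
        Set<String> deletable = new HashSet<>(candidateBlobs);
        for (Set<String> refs : referencesFiles.values()) {
            deletable.removeAll(refs);
        }
        sweptRepos.add(repoId);
        // The proposed change: retain all "references" files until every
        // repository has attempted the sweep phase, then clear them.
        if (sweptRepos.containsAll(repositories)) {
            referencesFiles.clear();
            sweptRepos.clear();
        }
        return deletable;
    }
}
```

In the staging/production example above, staging's first sweep attempt returns nothing (production hasn't marked), and the unreferenced blob is collected on the following cycle because the "references" files were retained.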

h3. Proposal 2

If we require that every repository complete the "mark" phase before any 
repository can attempt a "sweep" phase, that won't eliminate the need for every 
repository to perform the sweep.  The sweep is still needed per repository 
because each repository has binaries that can only be deleted by that 
repository.

What it could do is coordinate the sweep phases so that less time elapses than 
in proposal 1.

However, I think you still have to answer the question: what does a repository 
do if it is ready to sweep but not all repositories have completed the mark 
phase?  This is almost what we have now.  If not every repository has completed 
the mark phase and one repository wants to sweep, what happens?  I assume it 
just cancels the sweep until the next scheduled GC time, in which case I don't 
see how this is any better than proposal 1.

h3. Proposal 3

This proposal is to only use my GC changes with CompositeDataStore.  I'm not 
sure exactly what we mean by this.

We could say that it is only used in repositories that are using a 
CompositeDataStore.  This could be done, although it would probably require 
changing the node store code so that it obtains the garbage collector from a 
registered reference instead of instantiating it directly, and then having the 
different data stores register a garbage collector for use by the node store.  
It might complicate the dependency tree and other things, depending on how the 
garbage collector becomes available to the node store (see where 
MarkSweepGarbageCollector is instantiated in SegmentNodeStoreService for what I 
mean).
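For reference, the registration idea might look something like this rough sketch (all names are hypothetical; the real wiring would presumably go through OSGi service registration rather than a static registry):

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative interface for whatever collector contract the node store
// would consume; not the actual Oak API.
interface RegisteredBlobCollector {
    void collectGarbage(boolean markOnly);
}

// Minimal whiteboard-style registry: the active data store registers the
// collector it wants used, and the node store looks it up instead of
// constructing MarkSweepGarbageCollector itself.
class GcRegistry {
    private static final AtomicReference<RegisteredBlobCollector> REGISTERED =
            new AtomicReference<>();

    static void register(RegisteredBlobCollector collector) {
        REGISTERED.set(collector);
    }

    static RegisteredBlobCollector lookup() {
        return REGISTERED.get();
    }
}
```

The point of the indirection is that the node store no longer decides which GC algorithm runs; whichever data store is active does.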

But it doesn't matter, because in my view this approach won't actually solve 
the problem.  The reason is that *both* of the participating systems have to 
use the same garbage collection algorithm.  In other words, if staging has the 
CompositeDataStore, it is going to rely on the production system to write the 
"sweepComplete" metadata file and leave the "references" files in place in 
order for the staging system to successfully complete the "sweep" phase.  The 
production system isn't using CompositeDataStore, though, so if getting the 
different GC algorithm depends on having a running CompositeDataStore, that 
won't work.

So the approach that would work would be to come up with a new garbage 
collector class that must be configured for any system that is using *or 
coordinating with* a CompositeDataStore.

In this case we could avoid changing the way GC works on standard shared data 
store systems, but it would require that existing systems coordinating with one 
using a CompositeDataStore be configured differently to do so, which feels bad 
to me.  It seems better if the other systems don't have to be configured a 
certain way based on the configuration of another participant in the system (a 
tight coupling issue).


[~amjain] do you feel I've covered the proposals accurately?  Once we think we 
are on the same page we can dig into them and figure out how to resolve them.

Any other proposals?

> CompositeDataStore - ReadOnly/ReadWrite Delegate Support
> --------------------------------------------------------
>                 Key: OAK-7083
>                 URL: https://issues.apache.org/jira/browse/OAK-7083
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: blob, blob-cloud, blob-cloud-azure, blob-plugins
>            Reporter: Matt Ryan
>            Assignee: Matt Ryan
>            Priority: Major
> Support a specific composite data store use case, which is the following:
> * One instance uses no composite data store, but instead is using a single 
> standard Oak data store (e.g. FileDataStore)
> * Another instance is created by snapshotting the first instance node store, 
> and then uses a composite data store to refer to the first instance's data 
> store read-only, and refers to a second data store as a writable data store
> One way this can be used is in creating a test or staging instance from a 
> production instance.  At creation, the test instance will look like 
> production, but any changes made to the test instance do not affect 
> production.  The test instance can be quickly created from production by 
> cloning only the node store, and not requiring a copy of all the data in the 
> data store.
