[ https://issues.apache.org/jira/browse/OAK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373222#comment-16373222 ]

Matt Ryan commented on OAK-7083:
--------------------------------

Since [https://github.com/apache/jackrabbit-oak/pull/80] entails a change to 
the {{MarkSweepGarbageCollector}}, we should discuss that change here to see 
whether there are concerns with it or whether there is a better approach.
 
Let me try to briefly explain the change I've proposed to the 
{{MarkSweepGarbageCollector}}.
 
In the use case I tested there are two Oak repositories, one we will call 
primary and one we will call secondary.  Primary is created first; secondary 
is created by cloning primary's node store and then using a 
{{CompositeDataStore}} with two delegate data stores.  The first delegate is 
the same data store primary uses, in read-only mode.  The second delegate is 
accessible only by the secondary repo.
 
Let the data store shared by primary and secondary be called DS_P and the data 
store being used only by the secondary be called DS_S.  DS_P can be read by 
secondary but not modified, so all changes on secondary are saved in DS_S.  
Primary can still make changes to DS_P.
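
To make the delegate behavior concrete, here is a minimal sketch of the 
routing just described (the class and method names are illustrative, not the 
actual {{CompositeDataStore}} API): reads consult DS_S first and fall back to 
DS_P, while writes always land in DS_S.

{code:java}
import java.io.InputStream;

import org.apache.jackrabbit.core.data.DataIdentifier;
import org.apache.jackrabbit.core.data.DataRecord;
import org.apache.jackrabbit.core.data.DataStore;
import org.apache.jackrabbit.core.data.DataStoreException;

// Illustrative sketch only - not the actual CompositeDataStore implementation.
class ReadOnlyReadWriteRouter {
    private final DataStore readOnlyDelegate;  // DS_P, shared with primary
    private final DataStore writableDelegate;  // DS_S, private to secondary

    ReadOnlyReadWriteRouter(DataStore readOnly, DataStore writable) {
        this.readOnlyDelegate = readOnly;
        this.writableDelegate = writable;
    }

    // Reads see the union of both delegates.
    DataRecord getRecordIfStored(DataIdentifier id) throws DataStoreException {
        DataRecord record = writableDelegate.getRecordIfStored(id);
        return record != null ? record : readOnlyDelegate.getRecordIfStored(id);
    }

    // Writes never touch the shared, read-only delegate.
    DataRecord addRecord(InputStream stream) throws DataStoreException {
        return writableDelegate.addRecord(stream);
    }
}
{code}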
 
Suppose that after creating both repositories, records A and B are deleted 
from the primary repo, and records B and C are deleted from the secondary 
repo.  Since DS_P is shared, only blob B should actually be deleted from DS_P 
via GC.  After both repositories run their “mark” phase, primary has created 
a “references” file in DS_P that excludes A and B, meaning primary thinks 
both A and B can be deleted, and secondary has created a “references” file in 
DS_P that excludes B and C, meaning secondary thinks both B and C can be 
deleted.
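
Expressed as set arithmetic (reduced to just blobs A, B and C for 
illustration), the merged mark-phase output looks like this - a blob survives 
if any repository still references it, which is why A must not be swept even 
though primary deleted it:

{code:java}
import java.util.HashSet;
import java.util.Set;

public class MarkPhaseExample {
    public static void main(String[] args) {
        // After the deletions above: primary no longer references A or B;
        // secondary no longer references B or C, but still references A.
        Set<String> primaryRefs = Set.of();
        Set<String> secondaryRefs = Set.of("A");

        // A blob may be swept only if *no* references file lists it.
        Set<String> stillReferenced = new HashSet<>(primaryRefs);
        stillReferenced.addAll(secondaryRefs);
        System.out.println(stillReferenced);  // [A] - so B and C are candidates
    }
}
{code}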
 
Suppose primary runs the sweep phase first.  It will first verify that it has 
a references file for each repository file in DS_P; since both primary and 
secondary put one there, this check passes.  It will then merge all the data 
in all the references files in DS_P with its own local view of the existing 
blobs, and come up with a set of blobs to delete.  Primary will conclude that 
blobs B and C should be deleted - B because both primary and secondary marked 
it deletable, and C because secondary marked it deletable and primary has no 
knowledge of C, so it assumes it is okay to delete.  At this point primary 
will delete B and try to delete C and fail (which is okay).  Then primary 
will delete its “references” file from DS_P and consider the sweep phase 
complete.
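
A hedged sketch of those two sweep steps (names are illustrative, not the 
actual {{MarkSweepGarbageCollector}} code):

{code:java}
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only - not the actual MarkSweepGarbageCollector code.
class SweepSketch {
    // Step 1: every repository registered in the shared data store must have
    // published a references file, otherwise the sweep is cancelled.  This is
    // the check that fails for secondary in the scenario described below.
    static void checkAllReferencesPresent(Set<String> repositoryIds,
                                          Set<String> referencesFileIds) {
        if (!referencesFileIds.containsAll(repositoryIds)) {
            throw new IllegalStateException(
                    "Not all repositories have marked references; cancelling GC");
        }
    }

    // Step 2: merge all references files; a locally known blob is a deletion
    // candidate only if no repository references it.
    static Set<String> deletionCandidates(Set<String> localBlobIds,
                                          List<Set<String>> allReferences) {
        Set<String> candidates = new HashSet<>(localBlobIds);
        for (Set<String> refs : allReferences) {
            candidates.removeAll(refs);
        }
        return candidates;
    }
}
{code}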
 
Now the problem comes when secondary tries to run the sweep phase.  It will 
first try to verify that a references file exists for each repository file in 
DS_P - and fail, because primary has already deleted its references file.  
Secondary therefore cancels GC, and blob C never gets deleted.  Note that 
secondary must delete C, because it is the only repository that knows about 
C.
 
The same situation also arises if secondary sweeps first.  If record D was 
created by primary after secondary was cloned, and D is later deleted by 
primary, secondary never knows about blob D and cannot delete it during the 
sweep phase - it can only be deleted by primary.
 
 
The change I made to the garbage collector is that when a repository finishes 
the sweep phase, it does not necessarily delete the references file.  
Instead, it marks the data store with a “sweepComplete” file indicating that 
this repository has finished the sweep phase.  Once there is a 
“sweepComplete” file for every repository (in other words, once the last 
repository has swept), all the references files are deleted.
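
A minimal sketch of the proposed completion logic (the {{SharedStore}} 
interface and file-name prefixes here are illustrative, not the actual 
patch):

{code:java}
import java.util.Set;

// Illustrative sketch only - names do not match the actual patch.
interface SharedStore {
    void addMarkerFile(String name);
    Set<String> repositoryIds();                       // from the repository files
    Set<String> repositoriesWithMarker(String prefix);
    void deleteAllWithPrefix(String prefix);
}

class SweepCompletion {
    static void finishSweep(String repoId, SharedStore store) {
        // Record that this repository finished its sweep phase, but leave the
        // references files in place for repositories that have yet to sweep.
        store.addMarkerFile("sweepComplete-" + repoId);

        // The last repository to sweep cleans up the shared GC state.
        if (store.repositoriesWithMarker("sweepComplete-")
                 .containsAll(store.repositoryIds())) {
            store.deleteAllWithPrefix("references-");
            store.deleteAllWithPrefix("sweepComplete-");
        }
    }
}
{code}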
 
I wrote an integration test covering DSGC for this specific composite data 
store use case at 
[https://github.com/mattvryan/jackrabbit-oak/blob/39b33fe94a055ef588791f238eb85734c34062f3/oak-blob-composite/src/test/java/org/apache/jackrabbit/oak/blob/composite/CompositeDataStoreRORWIT.java].
 
 
All the Oak unit tests pass with this change.  I am interested in any 
concerns others on-list may have about unforeseen consequences of this 
change.  There is also the issue that the sweep must now be performed by 
every repository sharing the data store, which introduces some inefficiency.  
I’m open to changes or to a different approach, so long as it still solves 
the problem described above.

> CompositeDataStore - ReadOnly/ReadWrite Delegate Support
> --------------------------------------------------------
>
>                 Key: OAK-7083
>                 URL: https://issues.apache.org/jira/browse/OAK-7083
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: blob, blob-cloud, blob-cloud-azure, blob-plugins
>            Reporter: Matt Ryan
>            Assignee: Matt Ryan
>            Priority: Major
>
> Support a specific composite data store use case, which is the following:
> * One instance uses no composite data store, but instead is using a single 
> standard Oak data store (e.g. FileDataStore)
> * Another instance is created by snapshotting the first instance node store, 
> and then uses a composite data store to refer to the first instance's data 
> store read-only, and refers to a second data store as a writable data store
> One way this can be used is in creating a test or staging instance from a 
> production instance.  At creation, the test instance will look like 
> production, but any changes made to the test instance do not affect 
> production.  The test instance can be quickly created from production by 
> cloning only the node store, and not requiring a copy of all the data in the 
> data store.


