Amit Jain commented on OAK-7083:

In order to modify the blob ID within CompositeDataStore, we would have to wait 
until the delegate wrote the record, get the blob ID, modify it, and then 
rewrite the blob with the new ID.  How is that to be done if not by calling 
addRecord() on the delegate in the first place?
This is what I think you have misunderstood. Why would we need to rewrite the blob 
with the new augmented ID? The ID of the blob remains whatever the DataStore 
delegate wants it to be (currently all delegates use the SHA-256 digest of the 
content as the ID). Only the blob ID returned up the stack, to store the 
reference in the NodeStore (i.e. as a property on the nodes), is augmented with 
the additional information. Currently that information is the length of the 
blob; what we are proposing here is to also add a datastore type/location 
identifier. That is why in my previous replies I pointed to the piece of code 
where the length is suffixed onto the ID, which could be used as a template.
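To make the idea concrete, here is a minimal sketch of what such augmentation could look like. The class and method names are hypothetical, not Oak's actual API; the separator and the "ds1" type identifier are assumptions chosen for illustration, following the pattern of how the length is already suffixed onto the ID today:

```java
// Hypothetical sketch: the delegate's raw blob ID is never changed; only the
// reference handed up to the NodeStore carries the extra suffixed metadata.
public class BlobIdAugmenter {
    private static final char SEPARATOR = '#'; // assumed separator for illustration

    // Build the augmented reference: raw ID + length + datastore type identifier.
    static String augment(String rawId, long length, String typeId) {
        return rawId + SEPARATOR + length + SEPARATOR + typeId;
    }

    // Recover the delegate's raw ID when resolving the reference back to a blob.
    static String rawId(String augmentedId) {
        int idx = augmentedId.indexOf(SEPARATOR);
        return idx < 0 ? augmentedId : augmentedId.substring(0, idx);
    }

    public static void main(String[] args) {
        String raw = "9f86d081884c7d65";        // delegate's SHA-256-based ID (unchanged)
        String ref = augment(raw, 1024, "ds1"); // reference stored as a node property
        System.out.println(ref);                // 9f86d081884c7d65#1024#ds1
        System.out.println(rawId(ref));         // 9f86d081884c7d65
    }
}
```

Because the delegate only ever sees the raw ID, nothing needs to be rewritten in the delegate's store; the augmentation and parsing happen entirely in the composite layer.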

And even if we did that, how is the modified delegate to be passed to the 
delegate data store?  Since OakFileDataStore doesn't implement TypedDataStore 
we cannot pass anything to addRecord() other than the input stream.  To change 
that would require modifying FileDataStore in Jackrabbit.
As I said previously, this is not required, and nothing changes for blobs that 
have already been stored.

The approaches you have described rest on the mistaken notion that the blob 
itself would have to be modified with the augmented ID, which is not the case.

Otherwise, we would move forward with evaluating the PR for acceptance into Oak 
in order that full-scale performance testing can begin on it as soon as 
Maybe I am wrong, but what I took from the discussion was that, for the use 
case being solved, we would define what the expected performance is, at least 
a benchmark for the repository size being targeted. This does not need to be a 
full-scale performance test. I would rather have it now, so that we know what 
we are supporting and can document it. If you think the augmentation would 
still take too long, we can move forward with the current patch, but with some 
preliminary search results and documentation of the use case, including the 
repository size being targeted.

> CompositeDataStore - ReadOnly/ReadWrite Delegate Support
> --------------------------------------------------------
>                 Key: OAK-7083
>                 URL: https://issues.apache.org/jira/browse/OAK-7083
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: blob, blob-cloud, blob-cloud-azure, blob-plugins
>            Reporter: Matt Ryan
>            Assignee: Matt Ryan
>            Priority: Major
> Support a specific composite data store use case, which is the following:
> * One instance uses no composite data store, but instead is using a single 
> standard Oak data store (e.g. FileDataStore)
> * Another instance is created by snapshotting the first instance node store, 
> and then uses a composite data store to refer to the first instance's data 
> store read-only, and refers to a second data store as a writable data store
> One way this can be used is in creating a test or staging instance from a 
> production instance.  At creation, the test instance will look like 
> production, but any changes made to the test instance do not affect 
> production.  The test instance can be quickly created from production by 
> cloning only the node store, and not requiring a copy of all the data in the 
> data store.

This message was sent by Atlassian JIRA