[ 
https://issues.apache.org/jira/browse/OAK-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Ryan updated OAK-7091:
---------------------------
    Description: 
When adding a new record to an Oak instance that is using a composite data 
store, the blob stream will be read twice before it is stored - once by the 
composite data store (to determine the blob ID) and again by the delegate.  
This is necessary because if there are multiple writable delegates and one delegate 
already has a matching blob, the composite should call {{addRecord()}} on the 
delegate that has the matching blob, which may not be the highest priority 
delegate.  So we need to know the blob ID in order to select the correct 
writable delegate.

We could add a method to the CompositeDataStoreAware interface wherein the data 
store can be told which blob ID to use (from the composite) so that it doesn't 
have to process the stream again.  Then the composite data store, after having 
read the stream to a temporary file, can pass an input stream from the 
temporary file to the delegate along with the computed blob ID, to avoid 
reading the stream twice.

  was:When adding a new record to an Oak instance that is using composite data 
store, the blob stream will be read twice before it is stored - once by the 
composite data store (to determine the blob ID) and again by the delegate.  We 
could add a method to the CompositeDataStoreAware interface wherein the data 
store can be told which blob ID to use (from the composite) so that it doesn't 
have to process the stream again.  Then the composite data store, after having 
read the stream to a temporary file, can pass an input stream from the 
temporary file to the delegate along with the computed blob ID, to avoid 
reading the stream twice.


> Avoid streaming data twice in composite data store
> --------------------------------------------------
>
>                 Key: OAK-7091
>                 URL: https://issues.apache.org/jira/browse/OAK-7091
>             Project: Jackrabbit Oak
>          Issue Type: Technical task
>          Components: blob, blob-cloud, blob-cloud-azure, blob-plugins
>            Reporter: Matt Ryan
>            Assignee: Matt Ryan
>
> When adding a new record to an Oak instance that is using a composite data 
> store, the blob stream will be read twice before it is stored - once by the 
> composite data store (to determine the blob ID) and again by the delegate.  
> This is necessary because if there are multiple writable delegates and one 
> delegate already has a matching blob, the composite should call 
> {{addRecord()}} on the delegate that has the matching blob, which may not be 
> the highest priority delegate.  So we need to know the blob ID in order to 
> select the correct writable delegate.
> We could add a method to the CompositeDataStoreAware interface wherein the 
> data store can be told which blob ID to use (from the composite) so that it 
> doesn't have to process the stream again.  Then the composite data store, 
> after having read the stream to a temporary file, can pass an input stream 
> from the temporary file to the delegate along with the computed blob ID, to 
> avoid reading the stream twice.
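
The proposal above could be sketched roughly as follows. This is only an 
illustration under stated assumptions, not the actual Oak API: the 
BlobIdAwareDelegate interface, the InMemoryDelegate and CompositeSketch 
classes, and the use of a hex-encoded SHA-256 digest as the blob ID are all 
hypothetical stand-ins for whatever would be added to CompositeDataStoreAware.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical delegate contract; NOT the real CompositeDataStoreAware interface. */
interface BlobIdAwareDelegate {
    boolean hasRecord(String blobId);

    /** Stores the stream under the blob ID supplied by the composite. */
    void addRecord(InputStream stream, String blobId) throws IOException;
}

/** In-memory stand-in for a writable delegate, for illustration only. */
class InMemoryDelegate implements BlobIdAwareDelegate {
    final Map<String, byte[]> records = new HashMap<>();

    public boolean hasRecord(String blobId) {
        return records.containsKey(blobId);
    }

    public void addRecord(InputStream stream, String blobId) throws IOException {
        // The ID is trusted as given, so the delegate does not hash the data again.
        records.put(blobId, stream.readAllBytes());
    }
}

/** Hypothetical composite: reads the caller's stream once, then picks a delegate. */
class CompositeSketch {
    private final List<BlobIdAwareDelegate> delegates; // highest priority first

    CompositeSketch(List<BlobIdAwareDelegate> delegates) {
        this.delegates = delegates;
    }

    String addRecord(InputStream in) throws IOException, NoSuchAlgorithmException {
        // Assumption: the blob ID is the hex-encoded SHA-256 digest of the content.
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        Path tmp = Files.createTempFile("composite-blob", ".tmp");
        try {
            // Single pass over the caller's stream: spool to disk while hashing.
            Files.copy(new DigestInputStream(in, digest), tmp,
                    StandardCopyOption.REPLACE_EXISTING);
            String blobId = toHex(digest.digest());

            // Prefer a delegate that already holds the blob; otherwise use the
            // highest priority delegate.
            BlobIdAwareDelegate target = delegates.get(0);
            for (BlobIdAwareDelegate d : delegates) {
                if (d.hasRecord(blobId)) {
                    target = d;
                    break;
                }
            }

            // Hand the delegate a stream over the temporary file plus the known ID.
            try (InputStream fromTmp = Files.newInputStream(tmp)) {
                target.addRecord(fromTmp, blobId);
            }
            return blobId;
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

In this sketch the delegate still reads the bytes from the temporary file in 
order to store them, but the caller's stream is consumed only once and the 
digest is computed only once, by the composite.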



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
