[
https://issues.apache.org/jira/browse/OAK-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810934#comment-16810934
]
Matt Ryan commented on OAK-8186:
--------------------------------
I had a discussion with [[email protected]] offline yesterday to work through
some of the questions. To summarize, we determined that the intent of this
proposal is to allow *processing of a binary by more efficient means* than
streaming the binary through the JVM. We clarified the following points:
* The proposal applies only to Oak instances using FileDataStore.
* Oak will not provide direct access to any file. The proposal covers only
access to a copy of the file, created in a temporary location.
* The access is effectively read-only, meaning that Oak will not directly apply
any changes made to the file. If the user wishes to apply changes made to the
copy, the changed binary must be applied as an update via the existing JCR API.
Again, the intent of the proposal is to allow the creation of the temporary
file and third-party access to and processing of the file via more efficient
means than streaming the binary through the JVM and the JCR APIs. See [this
comment|https://issues.apache.org/jira/browse/OAK-8186?focusedCommentId=16808802&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16808802]
for an example use.
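As a rough illustration of the points above, here is a minimal sketch using only standard Java. It is not the proposed API: {{TempCopyExample}}, {{withTempCopy}}, and {{IOFunction}} are hypothetical names, and the {{InputStream}} stands in for whatever stream the repository would supply for the binary. The binary is copied to a temporary file, the caller processes the copy, and nothing written to the copy flows back to the repository.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempCopyExample {

    /** Hypothetical callback type that is allowed to throw IOException. */
    @FunctionalInterface
    public interface IOFunction<A, B> {
        B apply(A input) throws IOException;
    }

    /**
     * Hypothetical helper (not part of Oak): copies the stream to a temporary
     * file, hands the file to the processor, and deletes the copy afterwards.
     * Changes made to the copy are never written back to the source.
     */
    public static <T> T withTempCopy(InputStream binaryStream,
                                     IOFunction<Path, T> processor) throws IOException {
        Path tempFile = Files.createTempFile("oak-binary-", ".tmp");
        try {
            // createTempFile already created the file, so replace it.
            Files.copy(binaryStream, tempFile, StandardCopyOption.REPLACE_EXISTING);
            return processor.apply(tempFile);
        } finally {
            Files.deleteIfExists(tempFile);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a binary stream obtained from the repository.
        byte[] content = "example binary content".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(content)) {
            long size = withTempCopy(in, Files::size);
            System.out.println("Processed " + size + " bytes from the temp copy");
        }
    }
}
```

In this sketch the helper deletes the copy itself; who should own that cleanup is one of the open questions below.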
I've requested more detailed testing of the proposal to determine at what
point making a file copy and working directly with the copied file becomes
more efficient overall than using the existing supported approach. More data
should help confirm (or refute) the justification.
Some open questions:
* Who should be responsible for deleting the temporary file after use? It seems
to me the client should; the client knows when the file is no longer needed. I
don't want to burden Oak with the responsibility of deleting temporary files.
* If we were to implement such a feature, would we limit it to FileDataStore or
also support it for the cloud data stores? The same use case would apply
either way. Clients could of course use the direct download URI for cloud data
stores to make their own temp file, but in theory Oak could also provide a
single API for creating the temp file and for cloud data stores use the direct
download API to make the temp copy.
** Personally I'm less worried about how to do it for cloud data stores and
more worried about whether we should do it at all. The cloud data stores with
direct binary access have the effect of moving a lot of the binary state off of
the Oak instance; creating a temp file seems a step backwards.
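To make the two questions above concrete, here is a sketch of what a client could already do on its own: fetch the binary from a download URI into a temp file that the client owns and deletes. {{ClientOwnedTempFile}} and {{fromUri}} are hypothetical names, not an Oak API, and the demo uses a local {{file:}} URI as a stand-in for a direct download URI from a cloud data store.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical client-side wrapper: the client creates and deletes the copy. */
public class ClientOwnedTempFile implements AutoCloseable {

    private final Path path;

    private ClientOwnedTempFile(Path path) {
        this.path = path;
    }

    /** Downloads the content behind the URI into a new temporary file. */
    public static ClientOwnedTempFile fromUri(URI downloadUri) throws IOException {
        Path temp = Files.createTempFile("binary-copy-", ".tmp");
        try (InputStream in = downloadUri.toURL().openStream()) {
            Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException | RuntimeException e) {
            // Do not leave a partial copy behind if the download fails.
            Files.deleteIfExists(temp);
            throw e;
        }
        return new ClientOwnedTempFile(temp);
    }

    public Path path() {
        return path;
    }

    /** The client decides when the copy is no longer needed. */
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(path);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a local file plays the role of the remote binary.
        Path source = Files.createTempFile("source-", ".bin");
        Files.write(source, "stand-in for a cloud-hosted binary".getBytes(StandardCharsets.UTF_8));
        try (ClientOwnedTempFile copy = ClientOwnedTempFile.fromUri(source.toUri())) {
            System.out.println("Temp copy holds " + Files.size(copy.path()) + " bytes");
        } finally {
            Files.deleteIfExists(source);
        }
    }
}
```

With try-with-resources the deletion happens when the client is done, without Oak tracking any temporary files.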
Comments? /cc [~mduerig]/[~frm]/[~teofili]
> Create API in OAK for file access to binaries in the repository.
> ----------------------------------------------------------------
>
> Key: OAK-8186
> URL: https://issues.apache.org/jira/browse/OAK-8186
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Reporter: Henry Saginor
> Priority: Major
> Attachments: OAK File Access.jpg
>
>
> To get file access, applications normally write binaries to temp files. It
> would be nice if an API existed to get file access directly from OAK. This
> might also meet some use cases documented at
> [https://wiki.apache.org/jackrabbit/JCR%20Binary%20Usecase]
> Suggested API and implementation can be found here [1]. Also, see attached
> diagram [2].
> I can create a patch if I can get some feedback. Note that the suggested API
> makes it explicit that a temp file is created. I am not sure if direct access
> to files in the datastore would be safe. But I am open to suggestions.
> [1]
>
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/FileReferencable.java]
>
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/TempFileReference.java]
>
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/TempFileReferenceProvider.java]
>
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/FileDSBlobTempFileReference.java]
>
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/DataStoreBlobStore.java]
>
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentBlob.java]
>
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-store-spi/src/main/java/org/apache/jackrabbit/oak/plugins/value/jcr/BinaryImpl.java]
> [2]
> !OAK File Access.jpg!