Hi,
I very much share Francesco's concerns here. Unconditionally exposing
access to the operating system resources underlying Oak's inner workings
is troublesome for various reasons:
- who owns the resource? Who coordinates (concurrent) access to it and
how? What are the correctness and performance implications here (races,
deadlocks, corruption, JCR semantics)?
- it limits implementation freedom and hinders further evolution
(chunking, de-duplication, content based addressing, compression, gc,
etc.) for data stores.
- it bypasses JCR's security model
Pretty much all of this has been discussed in the scope of
https://issues.apache.org/jira/browse/JCR-3534 and
https://issues.apache.org/jira/browse/OAK-834. So I suggest we review
those discussions before we jump to conclusions.
Also what is the use case requiring such a vast API surface? Can't we
come up with an API that allows the blobs to stay under control of Oak?
If not, this is probably an indication that those blobs shouldn't go
into Oak, but just references to them, as Francesco already proposed.
Anything else is neither fish nor fowl: you can't have the JCR goodies
and at the same time access the underlying resources at will.
Michael
On 5.5.16 11:00 , Francesco Mari wrote:
This proposal introduces a huge leak of abstractions and has deep security
implications.
I guess that the reason for this proposal is that some users of Oak would
like to perform some operations on binaries in a more performant way by
leveraging the way those binaries are stored. If this is the case, I
suggest those users evaluate an application-level solution implemented
on top of the JCR API.
If a user needs to store some important binary data (files, images, etc.)
in an S3 bucket or on the file system for performance reasons, this
shouldn't affect how Oak handles blobs internally. If some assets are of
special interest for the user, then the user should bypass Oak and take
care of the storage of those assets directly. Oak can be used to store
*references* to those assets, which user code can then use to manipulate
the assets in its own business logic.
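
A minimal sketch of that idea, in plain Java so it runs without a
repository. The Map stands in for the properties of a JCR node, and the
property name "assetRef", the class name and the helper methods are all
hypothetical; none of this is Oak or JCR API. The point is only that the
repository side holds a reference, while the bytes live in storage that
the application owns.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: the Map stands in for a JCR node's properties.
class AssetReferenceSketch {

    // Assumed property name, chosen for this example only.
    static final String REF_PROPERTY = "assetRef";

    // The application writes the asset to storage it owns and keeps
    // only a URI reference on the "node".
    static Map<String, String> storeAsset(Path externalDir, String name, byte[] data) {
        try {
            Path asset = Files.write(externalDir.resolve(name), data);
            Map<String, String> nodeProperties = new HashMap<>();
            nodeProperties.put(REF_PROPERTY, asset.toUri().toString());
            return nodeProperties;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // User code resolves the reference itself; the repository is not involved.
    static byte[] loadAsset(Map<String, String> nodeProperties) {
        try {
            URI ref = URI.create(nodeProperties.get(REF_PROPERTY));
            return Files.readAllBytes(Paths.get(ref));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("assets");
        byte[] image = "not really an image".getBytes(StandardCharsets.UTF_8);
        Map<String, String> node = storeAsset(dir, "logo.png", image);
        System.out.println("stored reference: " + node.get(REF_PROPERTY));
        System.out.println("bytes read back:  " + loadAsset(node).length);
    }
}
```

With a real repository the reference would be an ordinary string
property on a node, so it stays subject to JCR access control while the
asset itself is managed entirely by the application.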
If the scenario I outlined is not what inspired this proposal, I would like
to know more about the reasons why this proposal was brought up. Which
problems are we going to solve with this API? Is there a more concrete use
case that we can use as a driving example?
2016-05-05 10:06 GMT+02:00 Davide Giannella <[email protected]>:
On 04/05/2016 17:37, Ian Boston wrote:
Hi,
If the File or URL is writable, will writing to the location cause issues
for Oak?
IIRC some Oak DS implementations use a digest of the content to determine
the location in the DS, so changing the content via Oak will change the
location, but changing the content via the File or URL won't. If I didn't
remember correctly, then ignore the concern. Fully supportive of the
approach, as a consumer of Oak. The locations will very probably leak
outside the context of an Oak session, so the API contract should make it
clear that the code using a direct location needs to behave responsibly.
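
To make the concern concrete, here is a small sketch of content-based
addressing. It assumes SHA-256 and a two-level directory layout purely
for illustration; the class and method names are hypothetical and this
is not the code of any Oak DS implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration of a digest-addressed store, not Oak code.
class ContentAddressSketch {

    // Derive the storage location from the digest of the content.
    // Algorithm and directory layout are assumptions for this example.
    static String locationFor(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            String id = hex.toString();
            return id.substring(0, 2) + "/" + id.substring(2, 4) + "/" + id;
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = "original content".getBytes(StandardCharsets.UTF_8);
        byte[] edited = "edited content".getBytes(StandardCharsets.UTF_8);

        String storedAt = locationFor(original);
        System.out.println("stored at:              " + storedAt);

        // Writing through the store: new content gets a new location.
        System.out.println("new location via store: " + locationFor(edited));

        // Writing through a File handle: the bytes change but the path does
        // not, so the path no longer matches the digest of what it holds.
        boolean stillConsistent = storedAt.equals(locationFor(edited));
        System.out.println("path matches content after in-place edit: " + stillConsistent);
    }
}
```

In a store laid out like this, an in-place write through File or URL
leaves a record whose location no longer matches its content, which any
digest-based lookup or de-duplication would then get wrong.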
It's a reasonable concern, and I'm not familiar with the details of the
implementation. It's worth keeping in mind, though: if we want to adapt
to URL or File, we may have to come up with some sort of read-only
version of them.
For the File class, IIRC, we could use the setReadOnly() and
setWritable() methods. I remember those being quite expensive in time,
though.
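
Roughly what I have in mind, as a sketch only. The class and helper
names are made up for the example; setReadOnly(), setWritable() and
canWrite() are the standard java.io.File methods.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Oak: hands out a File marked read-only.
class ReadOnlyFileSketch {

    static File readOnlyTempFile() {
        try {
            File f = File.createTempFile("blob", ".bin");
            f.deleteOnExit();
            // setReadOnly() returns false if the flag could not be changed.
            if (!f.setReadOnly()) {
                throw new IllegalStateException("could not mark " + f + " read-only");
            }
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File f = readOnlyTempFile();
        System.out.println("canWrite after setReadOnly():   " + f.canWrite());

        // Whoever holds the File can simply flip the flag back.
        boolean flipped = f.setWritable(true);
        System.out.println("setWritable(true) succeeded:    " + flipped);
        System.out.println("canWrite after setWritable():   " + f.canWrite());
    }
}
```

Note that this is a flag on the file, not a property of the handle: code
with the same OS permissions can call setWritable(true) again, and the
File javadoc warns that a JVM started with special privileges may still
modify files marked read-only. So it would only guard against accidental
writes.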
Davide