[ 
https://issues.apache.org/jira/browse/OAK-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836699#comment-16836699
 ] 

Henry Saginor edited comment on OAK-8275 at 5/9/19 9:35 PM:
------------------------------------------------------------

See the specification of the reset method 
https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#reset()
Specifically this - "If the method mark has not been called since the stream 
was created, or the number of bytes read from the stream since mark was last 
called is larger than the argument to mark at that last call, then an 
IOException might be thrown."

The way I read this, if you want to move the position to the end of the 
stream and then reset it to the starting position (something the commons 
compress library actually does), you need to call mark when you start 
processing, with a readlimit argument equal to the blob's size + 1. Otherwise 
reset may throw an IOException, because the marked position will have been 
invalidated. Since mark takes an int, the readlimit is capped at 
Integer.MAX_VALUE (blobs of roughly 2 GB, I think). In practice the readlimit 
should probably be a lot smaller than that to avoid memory issues. In my code 
I have set the max readlimit to 1 GB, which is probably still too large. But 
we can discuss this if/when the general approach is accepted. 

So this is why the InputStream wrapper would be limited in the blob size it 
can support. I hope this clarifies it.
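
To make the mark/reset behaviour concrete, here is a minimal sketch. It is not 
the wrapper from my branch: it uses java.io.BufferedInputStream over an 
in-memory byte array as a stand-in, and the class name MarkResetDemo, the 
helper canRewind and the 8-byte buffer size are made up for illustration. It 
marks the start of the stream, reads to the end, and then checks whether reset 
still works for a given readlimit.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class MarkResetDemo {

    /**
     * Marks the start of the stream, reads until end of stream, then tries
     * to reset. Returns true if reset succeeded, false if it threw.
     */
    static boolean canRewind(byte[] blob, int readLimit) throws IOException {
        // A deliberately small internal buffer (8 bytes, an arbitrary choice)
        // so that the readlimit, not the default buffer size, decides the outcome.
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(blob), 8)) {
            in.mark(readLimit);
            while (in.read() != -1) {
                // drain the stream one byte at a time
            }
            try {
                in.reset();
                return true;
            } catch (IOException e) {
                // the mark was invalidated because more than readLimit bytes were read
                return false;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] blob = new byte[32];
        System.out.println("readlimit 4:        " + canRewind(blob, 4));
        System.out.println("readlimit size + 1: " + canRewind(blob, blob.length + 1));
    }
}
```

With OpenJDK's BufferedInputStream I would expect the first call to report 
false and the second to report true. Other InputStream implementations are 
free to behave differently, since the specification only says an IOException 
"might" be thrown.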





> Add NIO channel access to JCR binaries
> --------------------------------------
>
>                 Key: OAK-8275
>                 URL: https://issues.apache.org/jira/browse/OAK-8275
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>            Reporter: Henry Saginor
>            Priority: Major
>
> This is a follow-up to the discussion started in OAK-8186. Currently JCR 
> binaries can only be accessed via InputStream. This is inefficient. It can 
> also be inadequate for some use cases. For example, handling some Zip file 
> formats like deflate64 requires random access.
> The proposal is to add an API that returns a SeekableByteChannel.
> Here is the new API I am proposing:
>  
> [https://github.com/hsaginor/jackrabbit/blob/createChannel/jackrabbit-api/src/main/java/org/apache/jackrabbit/api/ChannelBinary.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/createChannel2/oak-api/src/main/java/org/apache/jackrabbit/oak/api/Blob.java]
>  (see 2 added methods)
> And all of the implementation changes -
>  
> [https://github.com/apache/jackrabbit-oak/compare/trunk...hsaginor:createChannel2]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
