[ https://issues.apache.org/jira/browse/OAK-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118474#comment-16118474 ]

Andrei Dulceanu edited comment on OAK-5902 at 8/8/17 3:41 PM:
--------------------------------------------------------------

In the end I chose to implement custom chunking logic on both client and server 
because:

* {{LengthFieldFrameDecoder}} could still be used on the client => no need to 
change the encoding/decoding for the rest of the packets sent between client 
and server (head state, segments, references, etc.)
* Using {{ChunkedStream}} and {{ChunkedWriteHandler}} on the server would have 
required altering the whole decoding on the client (removing 
{{LengthFieldFrameDecoder}}, see above), implementing a state machine, etc. 
Basically, before this change every packet had a header and a payload. After 
the change, all packets except blobs stay as they are, while blobs have a 
single header for the whole blob, followed by {{NO_CHUNKS}} chunks of data 
without any header.
* Computing the hash of the {{InputStream}} on the fly with 
{{HashingInputStream}} (from Guava) means that, under the hood, all the bytes 
are buffered in an internal stream, i.e. a lot of memory is consumed on the 
server side, which goes against the whole chunking concept (on both client 
and server).
* It prevents transferring chunks that are too small: in {{ChunkedStream}} the 
chunk size is computed as {{Math.min(this.chunkSize, in.available())}}, which 
in practice means at most ~30K per chunk (see the sketch after this list).
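
For illustration, here is a minimal sketch of the idea behind the custom 
chunking (class and method names are mine, not taken from the patch): each 
chunk is filled by reading the stream in a loop, so its size is driven by the 
configured limit rather than by {{InputStream.available()}}.

{code:java}
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, for illustration only; not the actual patch.
class BlobChunker {

    private final int chunkSize; // e.g. 8 * 1024 * 1024

    BlobChunker(int chunkSize) {
        this.chunkSize = chunkSize;
    }

    /** Returns the next full chunk, a smaller final chunk, or null at EOF. */
    byte[] nextChunk(InputStream in) throws IOException {
        byte[] buffer = new byte[chunkSize];
        int filled = 0;
        // Keep reading until the buffer is full or the stream is exhausted,
        // instead of stopping at whatever available() happens to report.
        while (filled < chunkSize) {
            int n = in.read(buffer, filled, chunkSize - filled);
            if (n < 0) {
                break;
            }
            filled += n;
        }
        if (filled == 0) {
            return null; // end of stream
        }
        if (filled < chunkSize) {
            byte[] last = new byte[filled];
            System.arraycopy(buffer, 0, last, 0, filled);
            return last; // final, partially filled chunk
        }
        return buffer;
    }
}
{code}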

All changes can be tracked in [0]. I'm a bit worried about the additional 
{{-Xmx3584m}} heap size parameter added in {{pom.xml}} for 
{{DataStoreTestBase.testSyncBigBlob}}. This test verifies the sync of a 2.5 GB 
blob and fails without the increased heap size. As a side note, this 
additional test increases the duration of {{ExternalPrivateStoreIT}} from ~30 
seconds to ~180 seconds on my machine, but I guess we can decide later whether 
to keep it.

One more thing: although the amount of data sent in every chunk could be 
reduced (i.e. by not repeating the blobId), I chose not to do so because:
* it keeps us consistent with the previous design
* using an {{Attribute}} set on the channel in {{GetBlobRequestEncoder}} and 
later retrieved in {{ResponseDecoder}} breaks testability (for unit tests) and 
loses transparency. Moreover, keeping the blobId in each chunk avoids 
additional tricks for reconstructing it on the client when logging transfer 
progress.
* the blobId is just a few bytes per chunk, while a chunk can be as big as 
2.2 GB (see the frame sketch below).
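
To make the trade-off concrete, here is a hypothetical per-chunk frame layout 
(field order and names are illustrative, not the actual wire format of the 
patch), showing how small the repeated blobId is next to the payload:

{code:java}
import io.netty.buffer.ByteBuf;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; not the encoder from the patch.
final class ChunkFrames {

    /** Writes one length-prefixed chunk frame into the outbound buffer. */
    static void writeChunk(ByteBuf out, byte type, String blobId, byte[] data) {
        byte[] id = blobId.getBytes(StandardCharsets.UTF_8);
        // The length field covers everything after itself, so a plain
        // length-field-based frame decoder on the client still works.
        out.writeInt(1 + 4 + id.length + data.length);
        out.writeByte(type);
        out.writeInt(id.length); // lets the decoder split blobId from data
        out.writeBytes(id);      // a few bytes repeated per chunk...
        out.writeBytes(data);    // ...next to up to ~2 GB of chunk data
    }
}
{code}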

[~frm], could you take a look at the proposed changes?

[0] https://github.com/dulceanu/jackrabbit-oak/tree/cold-standby/chunking


> Cold standby should allow syncing of blobs bigger than 2.2 GB
> -------------------------------------------------------------
>
>                 Key: OAK-5902
>                 URL: https://issues.apache.org/jira/browse/OAK-5902
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>    Affects Versions: 1.6.1
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>            Priority: Minor
>             Fix For: 1.8, 1.7.6
>
>
> Currently there is a limitation on the maximum binary size (in bytes) that 
> can be synced between primary and standby instances. The limit is 
> {{Integer.MAX_VALUE}} (2,147,483,647) bytes, and no binaries bigger than it 
> can be synced between the instances.
> Per the comment at [1], the current protocol needs to be changed to allow 
> sending binaries in chunks, to overcome this limitation.
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/1.6/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/standby/client/StandbyClient.java#L125


