[
https://issues.apache.org/jira/browse/OAK-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Davide Giannella updated OAK-6659:
----------------------------------
Fix Version/s: 1.8.0
> Cold standby should fail loudly when a big blob can't be timely transferred
> ---------------------------------------------------------------------------
>
> Key: OAK-6659
> URL: https://issues.apache.org/jira/browse/OAK-6659
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: segment-tar, tarmk-standby
> Affects Versions: 1.7.6
> Reporter: Andrei Dulceanu
> Assignee: Andrei Dulceanu
> Priority: Critical
> Labels: cold-standby
> Fix For: 1.7.8, 1.8.0
>
> Attachments: OAK-6659.patch
>
>
> Due to changes made in OAK-4969, {{StandbyDiff#childNodeChanged}} currently
> triggers two 'sync blob' cycles. The test scenario is the same as the one in
> {{DataStoreTestBase#testSyncBigBlob}}: a new big blob (1GB) is added to the
> primary file store and a standby sync is then triggered to sync this content
> to the secondary file store.
> The first 'sync blob' cycle happens as a result of {{#process}} being called
> in {{StandbyDiff#childNodeChanged}}. Therefore, a new 'get blob' request is
> created on the client and the server starts sending chunks from the big blob.
> Now, if the time needed to transfer the entire blob from server to client
> exceeds {{readTimeoutMs}}, an {{IllegalStateException}} is correctly thrown
> by {{StandbyDiff#readBlob}}, but it is swallowed by the catch clause in
> {{StandbyDiff#childNodeChanged}}. A second 'sync blob' cycle is then
> triggered and, if {{readTimeoutMs * 2}} is enough to transfer the blob, the
> blob is synced on the standby, even though the first cycle failed with the
> same {{readTimeoutMs}}. This happens because the server continues sending
> the remaining chunks after the {{IllegalStateException}} was thrown in the
> first 'sync blob' cycle.
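
The timing described above can be illustrated with a minimal, self-contained
sketch. It is not the Oak code: {{StandbySyncSketch}}, {{FakeTransfer}},
{{syncSwallowing}} and {{syncFailingLoudly}} are hypothetical names, and the
transfer is simulated with plain arithmetic instead of real network I/O. It
only assumes what the description states: the server keeps sending chunks
after the client has timed out, and the timeout exception is caught instead of
propagated.

```java
class StandbySyncSketch {

    /** Hypothetical stand-in for the server side: it keeps sending across requests. */
    static final class FakeTransfer {
        long remainingMs;

        FakeTransfer(long totalMs) {
            this.remainingMs = totalMs;
        }
    }

    /** Simulates one 'get blob' request that waits at most readTimeoutMs. */
    static void readBlob(FakeTransfer transfer, long readTimeoutMs) {
        long waited = Math.min(transfer.remainingMs, readTimeoutMs);
        transfer.remainingMs -= waited;
        if (transfer.remainingMs > 0) {
            throw new IllegalStateException(
                    "blob not received within " + readTimeoutMs + " ms");
        }
    }

    /** Swallowing variant: the timeout is caught, so the caller never sees it. */
    static boolean syncSwallowing(FakeTransfer transfer, long readTimeoutMs) {
        try {
            readBlob(transfer, readTimeoutMs);
            return true;
        } catch (IllegalStateException e) {
            return false; // the reason for the failure is lost here
        }
    }

    /** Loud variant: the timeout propagates to the caller. */
    static boolean syncFailingLoudly(FakeTransfer transfer, long readTimeoutMs) {
        readBlob(transfer, readTimeoutMs);
        return true;
    }

    public static void main(String[] args) {
        // The transfer needs 150 ms, the timeout is 100 ms.
        FakeTransfer transfer = new FakeTransfer(150);
        System.out.println("cycle 1: " + syncSwallowing(transfer, 100));
        System.out.println("cycle 2: " + syncSwallowing(transfer, 100));
        try {
            syncFailingLoudly(new FakeTransfer(150), 100);
        } catch (IllegalStateException e) {
            System.out.println("loud: " + e.getMessage());
        }
    }
}
```

In this sketch the second swallowing cycle succeeds only because the first one
already consumed 100 ms of the simulated transfer, which mirrors the
{{readTimeoutMs * 2}} observation; the loud variant surfaces the timeout on
the first attempt.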
> The consequence of these two 'sync blob' cycles is that deleting the
> temporary file to which chunks are spooled on the client sometimes fails
> (for example on Windows, see OAK-6641). In that case the previous incomplete
> transfer is not deleted and new chunks from the second 'sync blob' cycle are
> appended to it. As a result, the blob persisted in the blob store on the
> client does not have the same size and id as the initial blob sent by the
> server.
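
The size mismatch can be reproduced in isolation with the following sketch. It
is an illustration under simplifying assumptions, not the standby
implementation: {{SpoolFileSketch}}, {{spool}} and {{sizeAfterTwoCycles}} are
hypothetical names, the chunk sizes are arbitrary, and the failed delete is
modelled by simply skipping the delete call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class SpoolFileSketch {

    /** Appends one chunk to the spool file, creating the file if it is missing. */
    static void spool(Path file, byte[] chunk) throws IOException {
        Files.write(file, chunk, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /**
     * Returns the spool file size after an interrupted first cycle (4 bytes)
     * followed by a complete second cycle (10 bytes).
     */
    static long sizeAfterTwoCycles(boolean deleteSucceeds) {
        try {
            Path file = Files.createTempFile("blob-", ".tmp");
            try {
                spool(file, new byte[4]);      // first cycle, cut short by the timeout
                if (deleteSucceeds) {
                    Files.delete(file);        // cleanup of the incomplete transfer
                }
                spool(file, new byte[10]);     // second cycle, full blob
                return Files.size(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterTwoCycles(true));   // prints 10
        System.out.println(sizeAfterTwoCycles(false));  // prints 14
    }
}
```

When the delete is skipped, the file ends up 4 bytes larger than the blob that
was sent, so any id derived from its content would differ as well.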
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)