[
https://issues.apache.org/jira/browse/OAK-9149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203926#comment-17203926
]
Stefan Egli commented on OAK-9149:
----------------------------------
I'm currently wondering about two things:
* does the current logic work 100% in exception/retry cases?
* assuming the above is fine, what granularity of _phases_ is required
for grouping updateOps appropriately for batching?
Reasons for questioning the current logic are:
* Order matters for exception case: IIUC {{backgroundSplit()}}, or more
correctly the
[{{SplitOperations.create()}}|https://github.com/apache/jackrabbit-oak/blob/002c7d9e31ac0160e52213bf2e0d18558b467187/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/SplitOperations.java#L179]
method creates the updateOps in a particular order -
(1) intermediates / (2) stalePrevs / (3) garbage / (4) main - and that order would seem
to be relevant for avoiding inconsistencies if one of them fails. The repository
is left in that half-split state for a little while until the
{{runBackgroundUpdateOperations()}} retry logic kicks in, and reading the
repository before the retry should still work correctly. I assume this is accounted
for by the order in which updateOps are returned (but I'll have to review this
part again).
* Successful retry logic: once a retry becomes necessary, that retry should
of course end up creating an appropriately structured split doc tree. My
concern is: does {{createIntermediateDocs()}} handle the failure/retry scenario
appropriately, given that the repository might have applied one split updateOp but not
all of them? I'll have to look at this code in more detail, including the corresponding tests.
But assuming it does work as expected, the assumption IIUC (and that's
essentially also what [~mreutegg] pointed out in the previous comment) is that the
order of executing split updateOps matters. The open question now is at what
granularity it matters - i.e. whether it is enough that the main doc is updated as one
last, separate step - or whether there is also a dependency between updating the
intermediates, disconnecting the stale prev docs and removing garbage.
Either way it seems necessary for the batch execution to have _phases_ -
whereby the updateOps of a particular phase can be grouped/batched across a
set of docs. Once a "phase update batch" is successful, the next phase can
be executed.
* Now if it's fine to have only 2 phases ("(1) the details, (2) then main"),
this can e.g. be handled by assuming the main doc updateOp is always the last one
returned by {{SplitOperations}} - the batch update then just needs to group
all those last updateOps and execute them in phase 2.
* If more phases are required though, then {{SplitOperations}} might have to
be adjusted to reflect the required granularity explicitly in the returned list.
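To make the 2-phase variant a bit more concrete, here is a rough sketch of what I have in mind. It is not Oak code: {{Op}}, {{PhasedSplitBatch}}, {{toPhases()}} and {{execute()}} are hypothetical names, {{Op}} is only a stand-in for an updateOp, and the {{batchWriter}} predicate stands in for whatever bulk call would actually be used. It also relies on the assumption above that the main doc updateOp is always the last one in each per-document list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for an update operation; this is NOT Oak's UpdateOp. */
final class Op {
    final String docId;
    final String kind; // e.g. "intermediate", "stalePrev", "garbage", "main"

    Op(String docId, String kind) {
        this.docId = docId;
        this.kind = kind;
    }

    @Override
    public String toString() {
        return kind + ":" + docId;
    }
}

/** Sketch of grouping split ops into two phases and running them in order. */
final class PhasedSplitBatch {

    /**
     * Groups the per-document op lists into two phases:
     * phase 1 = everything except the last op of each document,
     * phase 2 = the last op of each document.
     * Assumption: the last op of each list is the main doc update.
     */
    static List<List<Op>> toPhases(List<List<Op>> opsPerDoc) {
        List<Op> details = new ArrayList<>();
        List<Op> mains = new ArrayList<>();
        for (List<Op> ops : opsPerDoc) {
            if (ops.isEmpty()) {
                continue; // nothing to split for this document
            }
            int last = ops.size() - 1;
            details.addAll(ops.subList(0, last));
            mains.add(ops.get(last));
        }
        List<List<Op>> phases = new ArrayList<>();
        phases.add(details);
        phases.add(mains);
        return phases;
    }

    /**
     * Executes the phases in order, one batch per phase. An empty phase
     * counts as completed. Stops at the first failing batch so that later
     * phases are left for a retry. Returns the number of completed phases.
     */
    static int execute(List<List<Op>> phases, Predicate<List<Op>> batchWriter) {
        int completed = 0;
        for (List<Op> phase : phases) {
            if (!phase.isEmpty() && !batchWriter.test(phase)) {
                break; // do not touch main docs if the details failed
            }
            completed++;
        }
        return completed;
    }
}
```

If more than 2 phases turn out to be needed, the same {{execute()}} loop would still apply; only the grouping would have to change, which is where an explicit granularity in the returned list would come in.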
[~mreutegg], wdyt?
> Use batch calls in backgroundSplit
> ----------------------------------
>
> Key: OAK-9149
> URL: https://issues.apache.org/jira/browse/OAK-9149
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Affects Versions: 1.32.0
> Reporter: Stefan Egli
> Assignee: Stefan Egli
> Priority: Major
> Fix For: 1.36.0
>
>
> Currently the splitting of documents is done with individual write
> operations. This can significantly slow down the background update rate when
> there are many documents to split. The method
> DocumentNodeStore.backgroundSplit() should collect operations and perform
> them in bulk.