stefan-egli opened a new pull request #260: URL: https://github.com/apache/jackrabbit-oak/pull/260
2nd iteration of OAK-9149, building on
* https://github.com/apache/jackrabbit-oak/pull/243
* and https://github.com/apache/jackrabbit-oak/pull/244

which was redone based on the following review finding:
* https://issues.apache.org/jira/browse/OAK-9149?focusedCommentId=17170693&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17170693

This version now uses a _phased_ and _batched_ backgroundSplit.
* _phased_ : all updateOps except the one on the main document are combined into the first phase. These updateOps can be executed in one batch, and a partial execution in case of an error/exception does no harm: each individual updateOp will be redone if it needs to be. The more important part is that the update on the main document is done in a second step/phase, and that is what is now implemented. If this 2nd phase is executed only partially, that is again no problem, as it just means that for some documents the split didn't finish properly and needs to be redone.
** Note that if one of the phases is executed only partially, it is fine from a consistency point of view but still has the potential to leave garbage. A comment about this is added as a `// TODO` in `backgroundSplit()`.
* _batched_ : as mentioned above, all phase 1 updateOps are combined up to the configured batch size (`oak.documentMK.createOrUpdateBatchSize`, default 1000) and executed in one go against the DocumentStore (i.e. using the bulk version of the 'createOrUpdate' command).

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
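
For illustration only, the phased and batched ordering described in this pull request could be sketched roughly as follows. This is not the code from the PR: the class `PhasedBatchedSplitSketch`, the type `Op`, and the methods `partition` and `runSplit` are hypothetical stand-ins, and the `bulkStore` callback merely plays the role of a bulk 'createOrUpdate' call. Applying phase 2 one operation at a time is an assumption of this sketch; the description does not say how phase 2 is sent to the store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of a phased, batched split; not Oak's implementation. */
final class PhasedBatchedSplitSketch {

    /** Hypothetical stand-in for an update operation (not Oak's UpdateOp). */
    static final class Op {
        final String docId;
        final boolean onMainDocument;

        Op(String docId, boolean onMainDocument) {
            this.docId = docId;
            this.onMainDocument = onMainDocument;
        }
    }

    /** Cuts the list into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    /**
     * Phase 1: every op that does not touch a main document, sent in batches.
     * Phase 2: the ops on main documents, sent only after phase 1 is done.
     */
    static void runSplit(List<Op> ops, int batchSize, Consumer<List<Op>> bulkStore) {
        List<Op> phase1 = new ArrayList<>();
        List<Op> phase2 = new ArrayList<>();
        for (Op op : ops) {
            (op.onMainDocument ? phase2 : phase1).add(op);
        }
        // Phase 1 is safe to execute partially: unfinished ops are simply redone.
        for (List<Op> batch : partition(phase1, batchSize)) {
            bulkStore.accept(batch);
        }
        // Phase 2 (assumption of this sketch): one main-document op per call.
        for (Op op : phase2) {
            bulkStore.accept(List.of(op));
        }
    }
}
```

The point of the ordering is that a failure during phase 1 leaves the main document untouched, so the whole split can simply be retried later.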
