Hi Thomas,

On Fri, Feb 24, 2017 at 1:09 PM, Thomas Mueller <[email protected]> wrote:
> 9) Sorting of path is needed, so that the repository can be processed bit
> by bit. For that, the following logic is used, recursively: read at
> most 1000 child nodes. If there are more than 1000, then this subtree is
> never split but processed in one step (so many child nodes can still lead
> to large transactions, unfortunately). If less than 1000 child nodes, then
> the names of all child nodes are read, and processed in sorted order
> (sorted by node name).

This should work! We can implement a "paginated tree traversal" using the
approach above, and a similar approach can be used for Lucene indexes.
It would be good to record this in OAK-2556 (or, better, in a new issue)
so that we can look into implementing it in the parts that perform such
large transactions (async index reindexing, sync index reindexing,
content migration in sidegrade, etc.).
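
To make the quoted logic concrete, here is a minimal sketch of such a
traversal. It is not Oak code: the Node class, the traverse function and
the (path, split) result shape are hypothetical, made up for illustration.
It also assumes that reading one name beyond the limit is an acceptable
way to detect "more than 1000" children.

```python
from itertools import islice

MAX_CHILDREN = 1000  # threshold taken from the quoted description


class Node:
    """Hypothetical in-memory stand-in for a repository node (not the Oak API)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = {child.name: child for child in children}

    def child_names(self):
        # A real repository would stream these lazily, in no particular order.
        return iter(self.children)


def traverse(node, path="/", limit=MAX_CHILDREN):
    """Yield (path, split) work units in a deterministic order.

    split=True  -> process only the node at `path`; its children follow
                   as separate units, sorted by node name.
    split=False -> the node has more than `limit` children, so the whole
                   subtree at `path` must be processed in one step.
    """
    # Assumption: read limit + 1 names so "more than limit" can be detected
    # without listing every child.
    names = list(islice(node.child_names(), limit + 1))
    if len(names) > limit:
        yield path, False
        return
    yield path, True
    for name in sorted(names):
        child_path = path.rstrip("/") + "/" + name
        yield from traverse(node.children[name], child_path, limit)


if __name__ == "__main__":
    root = Node("", [Node("b", [Node("y"), Node("x")]), Node("a")])
    for unit in traverse(root, limit=2):
        print(unit)
```

Because the order is deterministic, the path of the last committed unit
could serve as a checkpoint: a later run would skip every unit up to and
including that path and continue from there, keeping each transaction
small except for the unsplit subtrees.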

Chetan Mehrotra
