[
https://issues.apache.org/jira/browse/OAK-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837941#comment-15837941
]
Thomas Mueller edited comment on OAK-5324 at 1/26/17 9:37 AM:
--------------------------------------------------------------
> But I assume this issue is rather about a way to introduce a new index or
> update an existing one when the system is online, right? In that case, the
> branch-less mode is off the table.
I see. I wrote a tool that allows managing indexes (creating, changing,
reindexing, removing) using a script, for both the regular and the branch-less
mode now:
http://svn.apache.org/r1780222
> At least for new indexes we could try to improve the branch handling in the
> DocumentNodeStore.
If that turns out to be much easier, we could probably make reindexing a
special case of creating a new index. For example, re-index into a new hidden
child node, ":data_1", ":data_2",..., so that the existing nodes are not
changed. And only change the pointer to the latest ":data_x" node at the end,
maybe in a separate commit. After that, the old, outdated ":data_(n-1)" node
could be removed step-by-step using multiple commits, or in one commit (which
can't conflict).
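
A minimal sketch of the ":data_n" idea above, using plain Python dictionaries instead of the Oak API. The function name, the "current" pointer key, and the dictionary layout are all hypothetical and only illustrate the order of the steps; in a real repository each step would be one or more commits.

```python
# Hypothetical in-memory model; none of these names come from Oak.
def reindex_into_new_generation(index, content, prop):
    """Build a fresh hidden ":data_n" node, then switch the pointer to it."""
    old = index.get("current")  # e.g. ":data_1", or None for a new index
    n = int(old.rsplit("_", 1)[1]) + 1 if old else 1
    new = f":data_{n}"

    # Step 1: populate the new generation; the existing one is not changed.
    entries = {}
    for path, props in content.items():
        if prop in props:
            entries.setdefault(props[prop], []).append(path)
    index[new] = entries

    # Step 2 (maybe a separate commit): point readers at the new generation.
    index["current"] = new

    # Step 3 (one or several commits): remove the outdated generation.
    if old is not None:
        del index[old]
    return new
```

Because readers only follow the pointer, they keep seeing the old generation until step 2, which is the property the approach relies on.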
Another option might be to split indexing into multiple commits. For example,
use a "fromPath" .. "toPath" range, and only re-index part of the repository at
a time.
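
A rough sketch of the range-based variant, again on a plain Python dictionary rather than the Oak API. The helper names, the fixed batch size, and the inclusive range bounds are assumptions made for illustration; the counter stands in for one commit per range.

```python
# Hypothetical helpers; the range semantics (inclusive bounds) are assumed.
def path_ranges(sorted_paths, batch_size):
    """Yield inclusive (from_path, to_path) pairs covering sorted_paths."""
    for i in range(0, len(sorted_paths), batch_size):
        chunk = sorted_paths[i:i + batch_size]
        yield chunk[0], chunk[-1]


def reindex_in_ranges(content, prop, batch_size):
    """Index one path range at a time; return the entries and commit count."""
    entries = {}
    commits = 0
    paths = sorted(content)
    for from_path, to_path in path_ranges(paths, batch_size):
        # Only nodes inside the current range are touched in this pass.
        for path in paths:
            if from_path <= path <= to_path and prop in content[path]:
                entries.setdefault(content[path][prop], []).append(path)
        commits += 1  # stands in for committing this range separately
    return entries, commits
```

Each pass touches at most `batch_size` paths, so the size of any single commit is bounded by the batch size rather than by the repository size.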
> Async re-index? Does that disable synchronous index updates while it is
> re-indexing?
I don't know currently.
was (Author: tmueller):
> But I assume this issue is rather about a way to introduce a new index or
> update an existing one when the system is online, right? In that case, the
> branch-less mode is off the table.
I see. I wrote a tool that allows managing indexes (creating, changing,
reindexing, removing) using a script, for both the regular and the branch-less
mode now:
http://svn.apache.org/r1780222
> At least for new indexes we could try to improve the branch handling in the
> DocumentNodeStore.
If that turns out to be much easier, we could probably make reindexing a
special case of creating a new index. For example, re-index into a new hidden
child node, ":data_1", ":data_2",..., so that the existing nodes are not
changed. And only change the pointer to the latest ":data_x" node at the very
end, in a separate commit.
Another option might be to split indexing into multiple commits. For example,
use a "fromPath" .. "toPath" range, and only re-index part of the repository at
a time.
> Async re-index? Does that disable synchronous index updates while it is
> re-indexing?
I don't know currently.
> Enable property index reindexing via oak-run
> --------------------------------------------
>
> Key: OAK-5324
> URL: https://issues.apache.org/jira/browse/OAK-5324
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Components: documentmk, run
> Reporter: Chetan Mehrotra
> Assignee: Thomas Mueller
> Fix For: 1.6, 1.8
>
>
> Currently, introducing a new property index or reindexing an existing
> property index is problematic on DocumentNodeStore. This happens
> because doing so results in either
> # Persisted branch - Which is slow at times and has issues related to
> conflict handling
> # Large in-memory branch, which increases heap pressure
> To enable this use case we should add some tooling in oak-run where we can
> use a different approach to achieve the same result.