[ https://issues.apache.org/jira/browse/OAK-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344768#comment-14344768 ]
Thomas Mueller commented on OAK-2556:
-------------------------------------
Making this configurable (if that's easy) is a good idea.
I don't think it's a big problem if the index is inconsistent within a
revision, as long as this is documented. With intermediate commits enabled,
a query might see only some of the nodes of a large transaction, but that
applies only to queries that use async indexes (typically for ordering or
full-text constraints). The query would never return nodes that don't match
the conditions, but it might not return _all_ nodes that match; that is to be
expected anyway when using an async index. If a stronger guarantee is needed
(for example a lookup with propertyName=x), a synchronous index is typically
used.
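
To illustrate what a configurable intermediate commit could look like, here is
a minimal sketch. It is not Oak code: the class IntermediateCommitSketch, the
method indexInBatches, and the batchSize parameter are hypothetical names made
up for this example, and the commit callback merely stands in for whatever
AsyncIndexUpdate would do to persist a partial index update.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; none of these names are part of the Oak API.
class IntermediateCommitSketch {

    /**
     * Passes the changes to the commit callback in batches of at most
     * batchSize entries. A batchSize <= 0 disables intermediate commits,
     * so everything goes into one single commit (the behaviour described
     * in the issue). Returns the number of commits performed.
     */
    static <T> int indexInBatches(Iterable<T> changes, int batchSize,
                                  Consumer<List<T>> commit) {
        List<T> pending = new ArrayList<>();
        int commits = 0;
        for (T change : changes) {
            pending.add(change);
            if (batchSize > 0 && pending.size() >= batchSize) {
                // Intermediate commit: hand over a copy, then start a new batch.
                commit.accept(new ArrayList<>(pending));
                pending.clear();
                commits++;
            }
        }
        if (!pending.isEmpty()) {
            // Final commit for whatever is left over.
            commit.accept(new ArrayList<>(pending));
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        List<Integer> changes = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            changes.add(i);
        }
        List<Integer> sizes = new ArrayList<>();
        int commits = indexInBatches(changes, 1000, batch -> sizes.add(batch.size()));
        System.out.println(commits + " commits, sizes " + sizes);
    }
}
```

With 2500 changes and a batch size of 1000, this performs three commits of
1000, 1000 and 500 changes. A query running between two of those commits would
see only the batches committed so far, which is the partial visibility
discussed above.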
> do intermediate commit during async indexing
> --------------------------------------------
>
> Key: OAK-2556
> URL: https://issues.apache.org/jira/browse/OAK-2556
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: oak-lucene
> Affects Versions: 1.0.11
> Reporter: Stefan Egli
>
> A recent issue found at a customer reveals a potential problem with the async
> indexer. Reading AsyncIndexUpdate.updateIndex, it looks like it performs
> the entire async index update *in one go*, i.e. in one commit.
> When, for some reason, the async indexer has to process a huge diff,
> this 'one big commit' can become gigantic. In fact, there is no limit
> to the size of the commit.
> So the suggestion is to do intermediate commits while the async indexer is
> running. This is acceptable because an async index is by design
> not 100% up-to-date anyway, so it would not make
> much of a difference if it committed after every 100 or 1000 changes.