[
https://issues.apache.org/jira/browse/OAK-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307659#comment-15307659
]
Vikas Saurabh commented on OAK-4412:
------------------------------------
Observation won't provide 'immediate visibility' after session.save(), even for
local commits.
If immediate visibility of at least the local changes is a hard requirement, we
might want to do a commit-hook-based update for local changes and only consume
external events via observation. BUT, that can lead to unexpected result sets,
because the order in which revisions become visible can differ from the order
in which they get indexed, e.g.:
* T1 -> local change {{rL1}} happens and gets indexed
* T2 -> remote change {{rR2}} is read via background read and put into
observation queue
* T3 -> local change {{rL3}} happens and gets indexed
* T4 -> observation event for {{rR2}} is processed and indexed
With this scheduling, the code at T3 could already see {{rR2}} when it
committed {{rL3}}. A query issued between T3 and T4 by that same code would
expect results from {{rL1, rR2, rL3}} but would actually get only
{{rL1, rL3}}, because {{rR2}} has not been indexed yet.
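
The T1-T4 schedule above can be replayed with a small toy model. This is only
a sketch under stated assumptions: {{ClusterNode}} and its methods are
hypothetical names made up for illustration, not Oak APIs, and "visible" and
"indexed" are modelled as plain sets of revision labels.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterNode:
    """Toy model (hypothetical, not Oak code) of one cluster node."""
    visible: set = field(default_factory=set)   # revisions readable by sessions
    indexed: set = field(default_factory=set)   # revisions present in the local index
    observation_queue: list = field(default_factory=list)

    def local_commit(self, rev):
        # Commit hook: a local change becomes visible and is indexed in one step.
        self.visible.add(rev)
        self.indexed.add(rev)

    def background_read(self, rev):
        # An external change becomes visible now but is only queued for indexing.
        self.visible.add(rev)
        self.observation_queue.append(rev)

    def process_observation(self):
        # The observer indexes the oldest queued external change.
        self.indexed.add(self.observation_queue.pop(0))

    def query(self):
        # A query answered purely from the index sees only indexed revisions.
        return set(self.indexed)


node = ClusterNode()
node.local_commit("rL1")             # T1
node.background_read("rR2")          # T2
node.local_commit("rL3")             # T3
expected = set(node.visible)         # what the code at T3 can already read
between_t3_and_t4 = node.query()     # what the index returns at that moment
node.process_observation()           # T4
after_t4 = node.query()

print("expected:", sorted(expected))
print("between T3 and T4:", sorted(between_t3_and_t4))
print("after T4:", sorted(after_t4))
```

In this model the query between T3 and T4 misses {{rR2}} even though the
committing code could already read it; the gap closes only once the observation
event is processed at T4.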
I can't think of a way to synchronize {{rR2}}'s visibility and indexing short
of tying indexing to the background read. Alternatively, we might just document
it and leave it at that - but if we really want to match the behavior of
today's property indices, we would probably need to resolve this.
> Lucene-memory property index
> ----------------------------
>
> Key: OAK-4412
> URL: https://issues.apache.org/jira/browse/OAK-4412
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: lucene
> Reporter: Tomek Rękawek
> Assignee: Tomek Rękawek
> Fix For: 1.6
>
>
> When running Oak in a cluster, each write operation is expensive. After
> performing some stress-tests with a geo-distributed Mongo cluster, we've
> found out that updating property indexes is a large part of the overall
> traffic.
> The asynchronous index would be an answer here (as the index update won't be
> made in the client request thread), but AEM requires the updates to be
> visible immediately in order to work properly.
> The idea here is to enhance the existing asynchronous Lucene index with a
> synchronous, locally-stored counterpart that will persist only the data since
> the last Lucene background reindexing job.
> The new index can be stored in memory or (if necessary) in MMAPed local
> files. Once the "main" Lucene index is being updated, the local index will be
> purged.
> Queries will use a union of results from the {{lucene}} and
> {{lucene-memory}} indexes.
> The {{lucene-memory}} index, as a locally stored entity, will be updated using
> an observer, so it'll get both local and remote changes.
> The original idea was suggested by [~chetanm] in the discussion on
> OAK-4233.
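
The union-and-purge behavior proposed in the description quoted above could be
sketched roughly as follows. This is a simplified illustration, not the
proposed implementation: {{HybridIndex}}, its methods and the sequence numbers
are all hypothetical, and the background job is assumed to take its entries
from the local index, whereas the real asynchronous indexer would read from
the repository.

```python
class HybridIndex:
    """Toy model (hypothetical) of an async index plus a local in-memory index."""

    def __init__(self):
        self.async_index = {}    # value -> set of paths; stands in for "lucene"
        self.memory_index = []   # (seq, value, path); stands in for "lucene-memory"
        self.seq = 0

    def on_change(self, value, path):
        # Observer callback: record the change locally so queries see it at once.
        self.seq += 1
        self.memory_index.append((self.seq, value, path))
        return self.seq

    def async_update(self, up_to_seq):
        # Background indexing job: everything up to up_to_seq lands in the
        # main index and is purged from the local index.
        remaining = []
        for seq, value, path in self.memory_index:
            if seq <= up_to_seq:
                self.async_index.setdefault(value, set()).add(path)
            else:
                remaining.append((seq, value, path))
        self.memory_index = remaining

    def query(self, value):
        # Union of results from the main index and the local index.
        recent = {path for _, v, path in self.memory_index if v == value}
        return self.async_index.get(value, set()) | recent


idx = HybridIndex()
first = idx.on_change("foo", "/a")
idx.on_change("foo", "/b")
before = idx.query("foo")        # both paths come from the local index
idx.async_update(first)          # the main index catches up to the first change
after = idx.query("foo")         # same result, now served from both indexes
pending = len(idx.memory_index)  # only the second change remains local
```

The query result stays the same across the background update; only the place
where each entry is stored changes.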
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)