[ 
https://issues.apache.org/jira/browse/OAK-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320275#comment-15320275
 ] 

Stefan Egli edited comment on OAK-4412 at 6/8/16 9:26 AM:
----------------------------------------------------------

[~tomek.rekawek], on the question of whether this local index would be updated 
synchronously in the commitEditor or asynchronously via observation: IIUC, the 
worry is that doing it in a commitEditor slows down every commit, while doing 
it via observation yields slightly out-of-date query results. 
What we could look into is going via observation, thus keeping the performance 
improvement for commits, but handling the 'out-of-date' aspect explicitly by 
delaying a query if we notice the index is indeed 'behind'. Such a query would 
then have to wait until the async index update is done.
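
A minimal sketch of the "delay the query while the index is behind" idea, in plain Java. Nothing here is an Oak API: the class {{IndexCatchUpGate}}, its method names, and the use of a monotonically increasing {{long}} as a stand-in for a revision are all assumptions made for illustration. The observer-side indexer reports how far it has got, and a query waits (with a timeout) until the index has reached the revision it needs.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper (not part of Oak): lets queries wait for the async index. */
class IndexCatchUpGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition advanced = lock.newCondition();
    // Assumption: revisions are modelled as an increasing counter.
    private long indexedRevision;

    /** Called by the observation-based indexer once a revision has been indexed. */
    void markIndexed(long revision) {
        lock.lock();
        try {
            if (revision > indexedRevision) {
                indexedRevision = revision;
                advanced.signalAll(); // wake up any delayed queries
            }
        } finally {
            lock.unlock();
        }
    }

    /**
     * Called by a query before it reads the index. Returns true once the index
     * has reached the given revision, or false on timeout or interruption.
     */
    boolean awaitIndexed(long revision, long timeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (indexedRevision < revision) {
                if (nanos <= 0L) {
                    return false; // index is still behind; caller decides what to do
                }
                nanos = advanced.awaitNanos(nanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

With something along these lines, the cost moves from commit time to query time only for queries that actually hit a lagging index; what to do on timeout (fail, or serve the stale result) is left open here.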


was (Author: egli):
[~tomek.rekawek], another point re the sync-commitEditor vs async-observation 
handling of updating the local index, and the resulting problem with going via 
async-observation: what could be done is to stick to async (with the advantage 
of not burdening commits) but handle the resulting issue that the index would be 
slightly delayed by delaying the query (if the index it would use is 
indeed 'behind') until the index is updated. This would move the potential 
performance hit from commit-time to query-time.

> Lucene-memory property index
> ----------------------------
>
>                 Key: OAK-4412
>                 URL: https://issues.apache.org/jira/browse/OAK-4412
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Tomek Rękawek
>            Assignee: Tomek Rękawek
>             Fix For: 1.6
>
>         Attachments: OAK-4412.patch
>
>
> When running Oak in a cluster, each write operation is expensive. After 
> performing some stress-tests with a geo-distributed Mongo cluster, we've 
> found out that updating property indexes is a large part of the overall 
> traffic.
> An asynchronous index would be an answer here (as the index update wouldn't 
> be made in the client request thread), but AEM requires the updates to be 
> visible immediately in order to work properly.
> The idea here is to enhance the existing asynchronous Lucene index with a 
> synchronous, locally-stored counterpart that will persist only the data since 
> the last Lucene background reindexing job.
> The new index can be stored in memory or (if necessary) in MMAPed local 
> files. Once the "main" Lucene index has been updated, the local index will be 
> purged.
> Queries will use a union of results from the {{lucene}} and 
> {{lucene-memory}} indexes.
> The {{lucene-memory}} index, as a locally stored entity, will be updated using 
> an observer, so it'll get both local and remote changes.
> The original idea was suggested by [~chetanm] in the discussion on 
> OAK-4233.
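
The union-and-purge behaviour described above can be sketched roughly as follows. This is illustrative plain Java, not the attached patch and not Oak or Lucene API: {{LocalOverlayIndex}} and its methods are hypothetical names, and the overlay only records additions (removed or changed property values are not handled in this sketch).

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory counterpart holding changes since the last async run. */
class LocalOverlayIndex {
    // property value -> paths of nodes carrying that value
    private final Map<String, Set<String>> pathsByValue = new ConcurrentHashMap<>();

    /** Called from an observer, so both local and remote changes arrive here. */
    void onChange(String propertyValue, String path) {
        pathsByValue
            .computeIfAbsent(propertyValue, k -> ConcurrentHashMap.newKeySet())
            .add(path);
    }

    /** Called once the main (async) index has been updated. */
    void purge() {
        pathsByValue.clear();
    }

    /** Union of the main index result and the local overlay, without duplicates. */
    Set<String> query(String propertyValue, Collection<String> mainIndexPaths) {
        Set<String> union = new LinkedHashSet<>(mainIndexPaths);
        union.addAll(pathsByValue.getOrDefault(propertyValue, Collections.emptySet()));
        return union;
    }
}
```

A real implementation would also have to coordinate the purge with the async indexer so that changes arriving between the indexer's checkpoint and the purge are not lost; the sketch ignores that.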



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
