[jira] [Commented] (OAK-4412) Lucene hybrid index

Chetan Mehrotra (JIRA) Wed, 27 Jul 2016 02:51:03 -0700

    [ 
https://issues.apache.org/jira/browse/OAK-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395332#comment-15395332
 ]


Chetan Mehrotra commented on OAK-4412:
--------------------------------------

bq. I assume you mean using the NRT search feature only for the local index and 
not the index that is committed into Oak

Yes

bq. IIRC from experiments in 2011, Lucene's NRT feature is not durable and 
without a separate WAL loses data.

That's useful info. However, in the current case we need those Lucene index 
files only while the system is running, and on restart it would be fine to 
purge them. I will keep this aspect in mind.

bq. Also, with the same index on every cluster member, the index gains no 
scalability benefits from the cluster, but that is a bigger issue which would 
need a significant rethink.

Scalability is a different aspect. This issue is focused more on providing a 
substitute for the property index, and it also attempts to provide more recent 
results for changes happening on a specific cluster node. For those indexes 
which require scalability (not all indexes in a given system would become 
large, as many would be sparse), it would be better to leverage the Solr 
support.


> Lucene hybrid index
> -------------------
>
>                 Key: OAK-4412
>                 URL: https://issues.apache.org/jira/browse/OAK-4412
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Tomek Rękawek
>            Assignee: Tomek Rękawek
>             Fix For: 1.6
>
>         Attachments: OAK-4412.patch
>
>
> When running Oak in a cluster, each write operation is expensive. After 
> performing some stress-tests with a geo-distributed Mongo cluster, we've 
> found out that updating property indexes is a large part of the overall 
> traffic.
> The asynchronous index would be an answer here (as the index update won't be 
> made in the client request thread), but AEM requires the updates to be 
> visible immediately in order to work properly.
> The idea here is to enhance the existing asynchronous Lucene index with a 
> synchronous, locally-stored counterpart that will persist only the data since 
> the last Lucene background reindexing job.
> The new index can be stored in memory or (if necessary) in memory-mapped 
> local files. Once the "main" Lucene index has been updated, the local index 
> will be purged.
> Queries will use a union of results from the {{lucene}} and 
> {{lucene-memory}} indexes.
> The {{lucene-memory}} index, as a locally stored entity, will be updated 
> using an observer, so it'll get both local and remote changes.
> The original idea was suggested by [~chetanm] in the discussion for 
> OAK-4233.
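
The flow described above (synchronous local updates, a union at query time, 
and a purge after the background index catches up) could be sketched roughly 
as follows. This is only an illustration using plain Java collections: the 
class HybridIndexSketch and its methods onChange, asyncCycle and query are 
hypothetical names, not Oak or Lucene APIs, and merging the local entries 
into the persisted map is a simplification of the real async indexer, which 
would read the changes from the repository.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the hybrid idea; not the Oak implementation. */
public class HybridIndexSketch {
    // Stands in for the persisted, asynchronously updated "lucene" index.
    private final Map<String, Set<String>> persisted = new HashMap<>();
    // Stands in for the local "lucene-memory" index: entries since the last async cycle.
    private final Map<String, Set<String>> local = new HashMap<>();

    /** Called for each observed change (local or remote); updates only the local index. */
    public void onChange(String value, String path) {
        local.computeIfAbsent(value, k -> new LinkedHashSet<>()).add(path);
    }

    /** Simplified background cycle: the main index catches up, then the local index is purged. */
    public void asyncCycle() {
        for (Map.Entry<String, Set<String>> e : local.entrySet()) {
            persisted.computeIfAbsent(e.getKey(), k -> new LinkedHashSet<>())
                     .addAll(e.getValue());
        }
        local.clear();
    }

    /** A query returns the union of results from both indexes. */
    public Set<String> query(String value) {
        Set<String> result = new LinkedHashSet<>(
                persisted.getOrDefault(value, Collections.<String>emptySet()));
        result.addAll(local.getOrDefault(value, Collections.<String>emptySet()));
        return result;
    }
}
```

In this sketch a change is visible to query() immediately after onChange(), 
and it stays visible after asyncCycle() because the entry has moved to the 
persisted map before the local one is cleared.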



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
