[ 
https://issues.apache.org/jira/browse/OAK-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256102#comment-15256102
 ] 

Davide Giannella commented on OAK-4233:
---------------------------------------

[~tomek.rekawek]
bq. OK, if everyone agrees that we should implement the Marcel's idea, let's do 
it. It'll help to solve the original issue anyway, as the index changes will be 
saved asynchronously.

+1 

[~chetanm]
bq. If we are switching to async mode then its better to make use of Lucene as 
that provides much compact storage and reduces load on "nodes" collection

+1 for reusing Lucene. I guess we should simply replace the
OakDirectory with something like an InMemoryDirectory, which could
end up being a simple extension (I didn't check the code). I'm slightly
concerned about the memory impact and any binary data. I don't know
what exactly ends up in the data store from Lucene, but should it stay
in memory?

bq. To make the async index more closer to current state (per cluster node) we 
can have another in memory Lucene index which gets updated by the background 
observer based on diff from last checkpoint and current state. The query can 
then be a union of cursors from the two indexes (base persisted index and in 
memory index)

+1 for using observation rather than commit hooks (if I understood
your statement correctly). This should allow us to process only
successful commits, and we should be able to limit it to local
events.
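
To make the idea concrete, here is a rough sketch of an observer that
only sees committed changes and ignores external ones. Everything in it
is an assumption for illustration: Change, LocalObserverSketch and
lookup are hypothetical names, not Oak types, and a plain map stands in
for the in-memory Lucene index.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;

final class LocalObserverSketch {

    /** Hypothetical stand-in for one committed property change; not an Oak type. */
    static final class Change {
        final String path;
        final String value;
        final boolean local; // true if the commit originated on this cluster node

        Change(String path, String value, boolean local) {
            this.path = path;
            this.value = value;
            this.local = local;
        }
    }

    // Stand-in for the in-memory index: property value -> paths holding it.
    private final Map<String, Set<String>> index = new ConcurrentHashMap<>();

    /** Called after a commit has succeeded, so failed commits never reach the index. */
    void contentChanged(Change change) {
        if (!change.local) {
            return; // external changes are left to the persisted async index
        }
        index.computeIfAbsent(change.value, k -> new ConcurrentSkipListSet<>())
             .add(change.path);
    }

    /** Returns the locally indexed paths for a value, or an empty set. */
    Set<String> lookup(String value) {
        return index.getOrDefault(value, Collections.emptySet());
    }

    public static void main(String[] args) {
        LocalObserverSketch observer = new LocalObserverSketch();
        observer.contentChanged(new Change("/content/a", "x", true));
        observer.contentChanged(new Change("/content/b", "x", false));
        System.out.println(observer.lookup("x"));
    }
}
```

The sketch leaves out how entries get dropped once the persisted index
has caught up with the checkpoint.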

[~catholicon]
bq. I couldn't come up with a case of duplicate rows

I don't recall the QE performing a distinct by default on
UnionQueryImpl. I could well be remembering wrong, though.
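
For illustration only, this is the kind of de-duplication I have in mind
if the same path can come back from both the persisted index and the
in-memory one. It is a sketch under assumptions: plain iterators of
paths stand in for the two cursors, and distinctUnion is a hypothetical
helper, not something taken from UnionQueryImpl.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class UnionCursorSketch {

    /**
     * Merges the paths from the persisted index and the in-memory index,
     * keeping first-seen order and dropping paths already returned.
     */
    static List<String> distinctUnion(Iterator<String> persisted, Iterator<String> inMemory) {
        Set<String> seen = new LinkedHashSet<>();
        persisted.forEachRemaining(seen::add);
        inMemory.forEachRemaining(seen::add);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // "/content/b" is indexed in both places, e.g. changed since the last checkpoint.
        List<String> persisted = Arrays.asList("/content/a", "/content/b");
        List<String> inMemory = Arrays.asList("/content/b", "/content/c");
        // prints [/content/a, /content/b, /content/c]
        System.out.println(distinctUnion(persisted.iterator(), inMemory.iterator()));
    }
}
```

A real cursor would do this lazily instead of collecting everything up
front; the eager version just keeps the example short.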



> Property index stored locally
> -----------------------------
>
>                 Key: OAK-4233
>                 URL: https://issues.apache.org/jira/browse/OAK-4233
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: documentmk, query
>            Reporter: Tomek Rękawek
>            Priority: Minor
>             Fix For: 1.6
>
>
> When running Oak in a cluster, each write operation is expensive. After 
> performing some stress-tests with a geo-distributed Mongo cluster, we've 
> found out that updating property indexes is a large part of the overall 
> traffic. Let's try to create a new property-local index, that will save the 
> indexed data locally, without sharing it.
> Assumptions:
> -there's a new {{property-local}} index type for which the data are saved in 
> the local SegmentNodeStore instance created specifically for this purpose,
> -local changes are indexed using a new editor, based on the 
> {{PropertyIndexEditor}},
> -remote changes are extracted from the JournalEntries fetched in the 
> background read operation and indexed as well,
> -the new index type won't support uniqueness restriction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
