[ 
https://issues.apache.org/jira/browse/LUCENE-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903366#comment-13903366
 ] 

Shai Erera commented on LUCENE-5438:
------------------------------------

I reviewed the patch, and it looks like it can roughly be divided into two parts:

* Infrastructure: IndexWriter.flushAndIncRef(), 
SIS.write/read(DataOutput/Input), SDR being public...
* Replication: all in the test, in the form of Replica and the various threads

What we need is to figure out how to tie the replication stuff into the 
Replicator API. I thought about it, discussed it a bit with Mike, and here's a 
proposal:

* Implement NRTRevision which will list the NRT files (flushed segments) and 
the SIS-as-byte[] that we get from IW.flushAndIncRef().
* Implement NRTIndexUpdateHandler (replica side) to act on the special 
revision, and use the SIS.read() API and SDR().
** We might also need the ReferenceManager that exists in the patch, which acts 
upon a SIS, rather than looking up commit points.
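
To make the first bullet more concrete, here is a minimal sketch of what an 
NRTRevision might carry. It is plain Java with no Lucene dependency: the class 
name NrtRevisionSketch, the version field and the filesMissingFrom() helper are 
assumptions for illustration only, not part of the patch or of the Replicator 
module's Revision interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: NOT the Replicator module's Revision interface.
final class NrtRevisionSketch {
    private final long version;              // assumed: increases with every flush
    private final List<String> nrtFiles;     // names of the flushed segment files
    private final byte[] segmentInfosBytes;  // the SIS serialized to a byte[]

    NrtRevisionSketch(long version, List<String> nrtFiles, byte[] segmentInfosBytes) {
        this.version = version;
        this.nrtFiles = Collections.unmodifiableList(new ArrayList<>(nrtFiles));
        this.segmentInfosBytes = segmentInfosBytes.clone();
    }

    long version() { return version; }

    List<String> files() { return nrtFiles; }

    // Defensive copy so callers cannot mutate the revision's state.
    byte[] segmentInfosBytes() { return segmentInfosBytes.clone(); }

    // Files a replica still has to copy, given the files it already holds.
    List<String> filesMissingFrom(Set<String> localFiles) {
        List<String> missing = new ArrayList<>();
        for (String file : nrtFiles) {
            if (!localFiles.contains(file)) {
                missing.add(file);
            }
        }
        return missing;
    }

    // A replica would only act on a revision newer than the one it has.
    boolean isNewerThan(NrtRevisionSketch other) {
        return other == null || version > other.version;
    }
}
```

On the replica side, a handler would copy whatever filesMissingFrom() reports 
and then hand the byte[] to the SIS.read() API mentioned above.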

That will get NRT segments to appear on the replicas easily, with some caveats:

* The replicas will need to copy over merged segments, which can take a lot of 
time and hurt latency.
* The replicas communicate w/ the primary node at their own leisure (well, 
configured interval), and for NRT we might want to notify the replicas of new 
files, to reduce latency.

As for merged segments, the patch already includes the foundation for this in 
the form of MergedSegmentWarmer, which holds on to the merged segments until all 
replicas have successfully copied them. That way replicas copy the big segments, 
and only after all of them are done is the merged segment exposed to the SIS on 
the primary side and listed by a new NRTRevision. There are still issues to 
take care of, such as replicas failing, but those are implementation details, 
I think.
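
The "hold the merged segment until every replica has it" bookkeeping could look 
roughly like the sketch below. It is not the MergedSegmentWarmer from the patch; 
MergedSegmentGate and its methods are hypothetical names, and dropping a failed 
replica from the pending set is just one possible policy for the open "replicas 
failing" question.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical bookkeeping; NOT the MergedSegmentWarmer from the patch.
final class MergedSegmentGate {
    private final String segmentName;
    private final Set<String> pendingReplicas;

    MergedSegmentGate(String segmentName, Set<String> replicaIds) {
        this.segmentName = segmentName;
        this.pendingReplicas = new HashSet<>(replicaIds);
    }

    String segmentName() { return segmentName; }

    // A replica reports that it finished copying the merged segment.
    // Returns true once no replica is pending any more.
    synchronized boolean replicaCopied(String replicaId) {
        pendingReplicas.remove(replicaId);
        return pendingReplicas.isEmpty();
    }

    // Assumed policy: a failed replica stops blocking the merge.
    synchronized boolean replicaFailed(String replicaId) {
        pendingReplicas.remove(replicaId);
        return pendingReplicas.isEmpty();
    }

    // Only when this is true would the primary expose the segment to the SIS.
    synchronized boolean readyToExpose() {
        return pendingReplicas.isEmpty();
    }
}
```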

To handle the push/pull issue we can offer another Replicator implementation - 
PushReplicator - which is given a list of replicas and every .publish() is 
communicated to all replicas so that they can start copying files immediately. 
Every replica will have an associated thread on the primary side to handle the 
replication logic and report on any failures that occurred with the replica. 
This will help w/ the bookkeeping that needs to be done on the primary side.
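
A rough sketch of the push idea, assuming one single-threaded worker per replica 
on the primary side. Everything here is hypothetical: ReplicaEndpoint stands in 
for whatever transport reaches a replica, and publish() takes a plain version 
string rather than a real Revision.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical transport to a single replica.
interface ReplicaEndpoint {
    void notifyNewRevision(String revisionVersion) throws Exception;
}

// Hypothetical sketch; NOT an existing Replicator implementation.
final class PushReplicatorSketch {
    private final Map<String, ReplicaEndpoint> replicas;
    private final Map<String, ExecutorService> workers = new LinkedHashMap<>();
    private final Map<String, Exception> failures = new ConcurrentHashMap<>();

    PushReplicatorSketch(Map<String, ReplicaEndpoint> replicas) {
        this.replicas = new LinkedHashMap<>(replicas);
        // One dedicated thread per replica keeps its notifications ordered.
        for (String id : this.replicas.keySet()) {
            workers.put(id, Executors.newSingleThreadExecutor());
        }
    }

    // Every publish is pushed to all replicas without blocking the caller.
    void publish(String revisionVersion) {
        for (Map.Entry<String, ReplicaEndpoint> entry : replicas.entrySet()) {
            String id = entry.getKey();
            ReplicaEndpoint endpoint = entry.getValue();
            workers.get(id).execute(() -> {
                try {
                    endpoint.notifyNewRevision(revisionVersion);
                } catch (Exception e) {
                    failures.put(id, e); // keep the latest failure per replica
                }
            });
        }
    }

    // Failures recorded so far, keyed by replica id.
    Map<String, Exception> failures() { return failures; }

    // Stops accepting work and waits briefly for queued notifications.
    void shutdownAndWait() {
        for (ExecutorService worker : workers.values()) {
            worker.shutdown();
        }
        try {
            for (ExecutorService worker : workers.values()) {
                worker.awaitTermination(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The failures() map is where the primary-side bookkeeping would hook in, e.g. to 
decide whether a replica should still hold up a merged segment.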

These are all just preliminary thoughts, I'm sure there are fun gotchas :).

> add near-real-time replication
> ------------------------------
>
>                 Key: LUCENE-5438
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5438
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/replicator
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 5.0, 4.7
>
>         Attachments: LUCENE-5438.patch, LUCENE-5438.patch
>
>
> Lucene's replication module makes it easy to incrementally sync index
> changes from a master index to any number of replicas, and it
> handles/abstracts all the underlying complexity of holding a
> time-expiring snapshot, finding which files need copying, syncing more
> than one index (e.g., taxo + index), etc.
> But today you must first commit on the master, and then again the
> replica's copied files are fsync'd, because the code operates on
> commit points.  But this isn't "technically" necessary, and it mixes
> up durability and fast turnaround time.
> Long ago we added near-real-time readers to Lucene, for the same
> reason: you shouldn't have to commit just to see the new index
> changes.
> I think we should do the same for replication: allow the new segments
> to be copied out to replica(s), and new NRT readers to be opened, to
> fully decouple committing from visibility.  This way apps can then
> separately choose when to replicate (for freshness), and when to
> commit (for durability).
> I think for some apps this could be a compelling alternative to the
> "re-index all documents on each shard" approach that Solr Cloud /
> ElasticSearch implement today, and it may also mean that the
> transaction log can remain external to / above the cluster.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
