On 15/03/2018 14:51, [email protected] wrote:
I went through this link -
http://blog.mikemccandless.com/2017/09/lucenes-near-real-time-segment-index.html
Lucy doesn't support any of Lucene's replication features.
I was thinking of implementing this. Can you suggest the best way to
implement the above methodology?
You could start by simply copying the index directory from the master to the
slaves while locking out access to the index on both master and slaves. Lucy's
index files never change, so you can use something equivalent to `rsync
--ignore-existing`.
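As a rough sketch of what that copy step does, here's the same
"skip files that already exist" behavior expressed in Python. The
function name and the flat walk over the directory tree are mine, not
part of Lucy; the point is just that write-once index files make a
naive one-way copy safe:

```python
import os
import shutil

def copy_index_ignore_existing(src, dst):
    """Copy an index directory, skipping files already present at the
    destination. Because Lucy index files never change once written, a
    file that already exists on the slave is identical to the master's
    copy, so skipping it is safe (same idea as rsync --ignore-existing)."""
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            target = os.path.join(target_dir, name)
            if not os.path.exists(target):  # the --ignore-existing part
                shutil.copy2(os.path.join(root, name), target)
```

In practice you'd still want rsync itself for network transfer; this is
only to illustrate the semantics.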
Here's an overview of the directory layout:
http://lucy.apache.org/docs/c/Lucy/Docs/FileFormat.html
Ignoring any lock files, the list of files is:
- snapshot_*.json
- schema_*.json
- seg_*/segmeta.json
- seg_*/cfmeta.json
- seg_*/cf.dat
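To find the current set of those files programmatically, you can read
the newest snapshot file. A small Python sketch follows; it assumes
(check the FileFormat docs above for the authoritative layout) that the
generation number after "snapshot_" is base-36 encoded and that the
snapshot JSON carries an "entries" array naming the files and segment
directories it references:

```python
import glob
import json
import os
import re

def latest_snapshot(index_dir):
    """Return the path of the snapshot file with the highest generation.
    Assumption: the generation after 'snapshot_' is base-36 encoded."""
    def gen(path):
        name = os.path.basename(path)
        return int(re.match(r"snapshot_(\w+)\.json", name).group(1), 36)
    snapshots = glob.glob(os.path.join(index_dir, "snapshot_*.json"))
    return max(snapshots, key=gen)

def snapshot_entries(snapshot_path):
    """List the files/directories a snapshot references. The 'entries'
    key is an assumption about the JSON schema; verify against the
    FileFormat documentation."""
    with open(snapshot_path) as f:
        return json.load(f)["entries"]
```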
If you want to support concurrent searching on the slaves, things get more
complicated. You should:
- Derive the list of segments to be copied from the latest snapshot
file.
- First copy the new schema and segment files.
- Copy the snapshot file at the end and make sure that it's updated
atomically.
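The "snapshot last, atomically" step can be done with the usual
write-to-temp-name-then-rename trick. A minimal Python sketch (the
function name is mine; on POSIX, rename within one filesystem is
atomic, so a concurrent searcher sees either the old snapshot set or
the complete new one, never a half-written file):

```python
import os
import shutil

def publish_snapshot(src_snapshot, dst_index_dir):
    """Install a snapshot file into a live slave index atomically:
    copy it under a temporary name first, then rename into place.
    os.replace() is an atomic rename on POSIX within one filesystem."""
    name = os.path.basename(src_snapshot)
    tmp = os.path.join(dst_index_dir, name + ".tmp")
    final = os.path.join(dst_index_dir, name)
    shutil.copy2(src_snapshot, tmp)
    os.replace(tmp, final)  # the atomic publish step
```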
If there are concurrent updates on the master, files referenced by the
snapshot can be deleted after you've read it. So make sure that no
indexing sessions are running during the file transfer, or acquire
Lucy's deletion lock first.
Afterwards you can delete old segments, either by consulting the file list or
by periodically creating an Indexer on the slaves and immediately destroying it.
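The "consult the file list" variant of that cleanup could look like the
Python sketch below. It takes the entry list from the latest snapshot
as input and removes unreferenced top-level files and segment
directories; the function name is mine, and for simplicity it leaves
snapshot and lock files alone (you'd also want to keep any snapshot a
searcher might still hold open):

```python
import os
import shutil

def prune_stale(index_dir, live_entries):
    """Delete top-level index files and segment directories that the
    latest snapshot no longer references. Snapshot files and lock files
    are skipped here for simplicity -- adjust to taste."""
    keep = set(live_entries)
    for name in os.listdir(index_dir):
        if name in keep or name.startswith("snapshot_") or name.endswith(".lock"):
            continue
        path = os.path.join(index_dir, name)
        if os.path.isdir(path):
            shutil.rmtree(path)  # stale segment directory
        else:
            os.remove(path)      # stale schema or other file
```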
Nick