Hi Robert,

Maybe I didn't explain it clearly, but we're not going to constantly switch between writers or share work between them. It's purely for availability: the second writer only kicks in when the first writer is unavailable for some reason. And as far as I know, the replicator/nrt module doesn't provide a solution for the case where the primary node (main indexer) goes down. How would we recover with a backup indexer?

Thanks
Patrick

On Thu, Dec 15, 2022 at 7:16 PM Robert Muir <rcm...@gmail.com> wrote:

> This multiple-writer isn't going to work and customizing names won't
> allow it anyway. Each file also contains a unique identifier tied to
> its commit so that we know everything is intact.
>
> I would look at the segment replication in lucene/replicator and not
> try to play games with files and mixing multiple writers.
>
> On Thu, Dec 15, 2022 at 5:45 PM Patrick Zhai <zhai7...@gmail.com> wrote:
> >
> > Hi Folks,
> >
> > We're trying to build a search architecture using segment replication
> > (indexer and searcher are separated, and the indexer ships new segments
> > to the searchers) right now, and one of the problems we're facing is:
> > for availability reasons we need to have multiple indexers running, and
> > when the searcher switches from consuming one indexer to another, there
> > is a chance that the segment names collide with each other (because
> > segment names are count based) and the searcher has to reload the whole
> > index.
> >
> > To avoid that, we're looking for a way to name the segments so that
> > Lucene is able to tell the difference and load only the difference (by
> > calling `openIfChanged`). I've checked the IndexWriter and the
> > DocumentsWriter, and it seems the name is controlled by a private final
> > method `newSegmentName()`, so it's likely not possible there. So I
> > wonder whether there are any other ways people are aware of that can
> > help control the segment names?
> >
> > An example of the situation described above:
> >
> > The searcher was previously consuming from indexer 1 and has the
> > following segments: _1, _2, _3, _4
> >
> > Indexer 2 previously sync'd from indexer 1, sharing the first 3
> > segments, and produced its own 4th segment (denoted _4', but it shares
> > the same "_4" name): _1, _2, _3, _4'
> >
> > Suddenly indexer 1 dies and the searcher switches from indexer 1 to
> > indexer 2. When it finishes downloading the segments and tries to
> > refresh the reader, it will likely hit the exception here, and it seems
> > all we can do right now is reload the whole index, which could
> > potentially be a high cost.
> >
> > Sorry for the long email and thank you in advance for any replies!
> >
> > Best
> > Patrick
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
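
To make the collision in the example above concrete, here is a minimal sketch in plain Java. It does not use any Lucene API: the `SegmentCollisionSketch` class, the `classify` method, the `Status` enum, and the ID strings are all hypothetical. It only assumes what the thread states, namely that segment names are count based and that each file carries a unique identifier, and it models a segment as a (name, id) pair.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the scenario in this thread; not Lucene code.
class SegmentCollisionSketch {

    enum Status { SHARED, NEW, COLLISION }

    // Classifies each segment offered by the remote indexer against what the
    // searcher already holds. Keys are segment names, values are made-up IDs.
    static Map<String, Status> classify(Map<String, String> local,
                                        Map<String, String> remote) {
        Map<String, Status> result = new TreeMap<>();
        for (Map.Entry<String, String> e : remote.entrySet()) {
            String localId = local.get(e.getKey());
            if (localId == null) {
                // Name not seen before: only this segment needs downloading.
                result.put(e.getKey(), Status.NEW);
            } else if (localId.equals(e.getValue())) {
                // Same name and same ID: the local copy can be reused.
                result.put(e.getKey(), Status.SHARED);
            } else {
                // Same name, different ID: the _4 versus _4' case.
                result.put(e.getKey(), Status.COLLISION);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Searcher state after consuming from indexer 1.
        Map<String, String> searcher =
            Map.of("_1", "id-a", "_2", "id-b", "_3", "id-c", "_4", "id-d");
        // Indexer 2 shares _1.._3 but wrote its own segment named _4.
        Map<String, String> indexer2 =
            Map.of("_1", "id-a", "_2", "id-b", "_3", "id-c", "_4", "id-e");

        System.out.println(classify(searcher, indexer2));
        // prints {_1=SHARED, _2=SHARED, _3=SHARED, _4=COLLISION}
    }
}
```

In this toy model only `_4` conflicts, which is the case where the searcher cannot do an incremental refresh and falls back to reloading the whole index.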