Re: Is there a way to customize segment names?

Patrick Zhai Thu, 15 Dec 2022 23:48:17 -0800

Hi Robert,

Maybe I didn't explain it clearly, but we're not going to constantly switch
between writers or share work between them. It's purely for availability:
the second writer only kicks in when the first writer is unavailable for
some reason.
And as far as I know, the replicator/nrt module does not provide a solution
for the case where the primary node (main indexer) goes down. How would we
recover with a backup indexer?
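
To make the collision concrete, here is a minimal sketch of the failover
case. It is a toy simulation, not Lucene code: the `Segment`, `Indexer`, and
`SegmentNameCollision` classes are hypothetical. It assumes names are "_"
plus a base-36 counter, and it uses a random UUID as a stand-in for the
per-file unique identifier you mentioned.

```java
// Hypothetical simulation, not Lucene code: all class names are made up.
final class Segment {
    final String name;        // count-based name, e.g. "_4"
    final java.util.UUID id;  // stand-in for the per-segment unique identifier

    Segment(String name, java.util.UUID id) {
        this.name = name;
        this.id = id;
    }
}

final class Indexer {
    private long counter;     // number that the next new segment will get

    Indexer(long startCounter) {
        this.counter = startCounter;
    }

    // Assumed naming scheme: "_" followed by the counter in base 36.
    Segment flush() {
        String name = "_" + Long.toString(counter++, Character.MAX_RADIX);
        return new Segment(name, java.util.UUID.randomUUID());
    }
}

final class SegmentNameCollision {
    // True when two segments share a name but were written independently.
    static boolean collides(Segment a, Segment b) {
        return a.name.equals(b.name) && !a.id.equals(b.id);
    }

    public static void main(String[] args) {
        // Both indexers already hold _1.._3, so both continue counting from 4.
        Indexer indexer1 = new Indexer(4);
        Indexer indexer2 = new Indexer(4);
        Segment s4 = indexer1.flush();       // "_4" from indexer 1
        Segment s4Prime = indexer2.flush();  // "_4" from indexer 2, other content
        System.out.println(s4.name + " vs " + s4Prime.name
                + ", collision: " + collides(s4, s4Prime));
    }
}
```

The point of the sketch is that the name alone cannot distinguish the two
segments, while the identifier can, so a searcher that already holds one of
them cannot treat the other as an incremental change.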


Thanks
Patrick


On Thu, Dec 15, 2022 at 7:16 PM Robert Muir <rcm...@gmail.com> wrote:

> This multiple-writer isn't going to work and customizing names won't
> allow it anyway. Each file also contains a unique identifier tied to
> its commit so that we know everything is intact.
>
> I would look at the segment replication in lucene/replicator and not
> try to play games with files and mixing multiple writers.
>
> On Thu, Dec 15, 2022 at 5:45 PM Patrick Zhai <zhai7...@gmail.com> wrote:
> >
> > Hi Folks,
> >
> > We're trying to build a search architecture using segment replication
> > (the indexer and searcher are separated, and the indexer ships new
> > segments to searchers). One of the problems we're facing is that, for
> > availability reasons, we need to have multiple indexers running, and
> > when the searcher switches from consuming one indexer to another, the
> > segment names may collide (because segment names are count-based), so
> > the searcher has to reload the whole index.
> > To avoid that, we're looking for a way to name the segments so that
> > Lucene can tell them apart and load only the difference (by calling
> > `openIfChanged`). I've checked the IndexWriter and the DocumentsWriter,
> > and naming seems to be controlled by a private final method,
> > `newSegmentName()`, so it is likely not possible there. So I wonder
> > whether there are any other ways people are aware of that can help
> > control the segment names?
> >
> > An example of the situation described above:
> > The searcher was previously consuming from Indexer 1 and has the
> > following segments: _1, _2, _3, _4
> > Indexer 2 previously synced from Indexer 1, sharing the first 3
> > segments, and produced its own 4th segment (denoted _4', but it shares
> > the same "_4" name): _1, _2, _3, _4'
> > Suddenly Indexer 1 dies and the searcher switches from Indexer 1 to
> > Indexer 2. When it finishes downloading the segments and tries to
> > refresh the reader, it will likely hit the exception here, and it seems
> > all we can do right now is reload the whole index, which could be
> > costly.
> >
> > Sorry for the long email and thank you in advance for any replies!
> >
> > Best
> > Patrick
> >
>
