Maybe raise this on solr-dev then? I'm not familiar with how that collector works. Does it count hits across all segments, or only within a single segment?
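To make the question concrete, here is a minimal sketch of the scenario described further down the thread. It uses no Lucene or Solr classes: `Segment`, `collect`, and `NEWEST_FIRST` are hypothetical names made up for illustration, and it assumes the hit limit is counted globally across segments, which is exactly the part I'm unsure about for Solr's collector. In Lucene itself the ordering hook would be the leaf sorter supplied when opening the DirectoryReader, as Patrick mentioned.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model only: not Lucene's or Solr's implementation.
final class SegmentOrderSketch {

    // Hypothetical stand-in for a segment: a creation sequence number
    // (higher = newer) and the ids of the documents that match the query.
    static final class Segment {
        final int createdSeq;
        final List<Integer> matchingDocs;

        Segment(int createdSeq, List<Integer> matchingDocs) {
            this.createdSeq = createdSeq;
            this.matchingDocs = matchingDocs;
        }
    }

    // Plays the role of a leaf sorter: visit the newest segments first.
    static final Comparator<Segment> NEWEST_FIRST =
            Comparator.comparingInt((Segment s) -> s.createdSeq).reversed();

    // Visits segments in the given order (or as listed when order is null)
    // and stops once maxHits documents are collected. The limit is counted
    // across all segments, which is an assumption of this sketch.
    static List<Integer> collect(List<Segment> segments, Comparator<Segment> order, int maxHits) {
        List<Segment> ordered = new ArrayList<>(segments);
        if (order != null) {
            ordered.sort(order);
        }
        List<Integer> hits = new ArrayList<>();
        for (Segment segment : ordered) {
            for (int doc : segment.matchingDocs) {
                if (hits.size() >= maxHits) {
                    return hits; // early termination
                }
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(
                new Segment(0, List.of(1, 2, 3, 4, 5)), // large dominant segment
                new Segment(1, List.of(100)),           // small, newer
                new Segment(2, List.of(200)));          // small, newest

        System.out.println(collect(segments, null, 3));         // prints [1, 2, 3]
        System.out.println(collect(segments, NEWEST_FIRST, 3)); // prints [200, 100, 1]
    }
}
```

With the segments visited as listed, the dominant segment uses up the whole limit. With the newest-first comparator, the recently updated documents are collected before the limit is reached. If the real collector counts hits per segment instead, the outcome would be different.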

On Tue, May 9, 2023 at 1:36 PM Wei <weiwan...@gmail.com> wrote:
>
> Hi Michael,
>
> I am applying early termination with Solr's EarlyTerminatingCollector
> <https://github.com/apache/solr/blob/d9ddba3ac51ece953d762c796f62730e27629966/solr/core/src/java/org/apache/solr/search/EarlyTerminatingCollector.java>,
> which triggers EarlyTerminatingCollectorException in SolrIndexSearcher
> <https://github.com/apache/solr/blob/d9ddba3ac51ece953d762c796f62730e27629966/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L281>
>
> Thanks,
> Wei
>
> On Thu, May 4, 2023 at 11:47 AM Michael Sokolov <msoko...@gmail.com> wrote:
>
> > Yes, sorry I didn't mean to imply you couldn't control this if you
> > want to. I guess in the typical setup it is not predictable. How are
> > you applying early termination? Are you using a standard Lucene
> > Collector or do you have your own?
> >
> > On Thu, May 4, 2023 at 2:03 PM Patrick Zhai <zhai7...@gmail.com> wrote:
> > >
> > > Hi Mike,
> > > Just want to mention that if the user chooses to use a single thread
> > > to index and uses LogXXMergePolicy, then the document order will be
> > > preserved as index order.
> > >
> > > On Thu, May 4, 2023 at 10:04 AM Wei <weiwan...@gmail.com> wrote:
> > >
> > > > Hi Michael,
> > > >
> > > > We are interested in the segment sequence for early termination. In
> > > > our case there is always a large dominant segment after an index
> > > > rebuild, then many small segments are generated with continuous
> > > > updates as time goes by. When early termination is applied, the
> > > > limit could be reached just by traversing the dominant segment
> > > > alone, and the newer, smaller segments don't get a chance. If we can
> > > > control the segment sequence so that the newer segments are visited
> > > > first, the documents with recent updates can be retrieved with early
> > > > termination. Do you think this makes sense? Any suggestion is
> > > > appreciated.
> > > >
> > > > Thanks,
> > > > Wei
> > > >
> > > > On Thu, May 4, 2023 at 3:33 AM Michael Sokolov <msoko...@gmail.com> wrote:
> > > >
> > > > > There is no meaning to the sequence. The segments are created
> > > > > concurrently by many threads and the merge process will merge them
> > > > > without regard to any ordering.
> > > > >
> > > > > On Wed, May 3, 2023, 1:09 PM Patrick Zhai <zhai7...@gmail.com> wrote:
> > > > >
> > > > > > For that part I'm not entirely sure, if other folks know it
> > > > > > please chime in :)
> > > > > >
> > > > > > On Wed, May 3, 2023 at 8:48 AM Wei <weiwan...@gmail.com> wrote:
> > > > > >
> > > > > > > Thanks Patrick! In the default case when no LeafSorter is
> > > > > > > provided, are the segments traversed in the order of creation
> > > > > > > time, i.e. the oldest segment is always visited first?
> > > > > > >
> > > > > > > Wei
> > > > > > >
> > > > > > > On Tue, May 2, 2023 at 7:22 PM Patrick Zhai <zhai7...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Wei,
> > > > > > > > Lucene in general iterates through the index in the order of
> > > > > > > > what is recorded in the SegmentInfos
> > > > > > > > <https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L140>
> > > > > > > > And at search time, you can specify the order using LeafSorter
> > > > > > > > <https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java#L75>
> > > > > > > > when you're opening the IndexReader
> > > > > > > >
> > > > > > > > Patrick
> > > > > > > >
> > > > > > > > On Tue, May 2, 2023 at 5:28 PM Wei <weiwan...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hello,
> > > > > > > > >
> > > > > > > > > We have an index that has multiple segments generated with
> > > > > > > > > continuous updates. Does Lucene have a specific order when
> > > > > > > > > iterating through the segments (assuming a single query
> > > > > > > > > thread)? Can the order be customized so that the latest
> > > > > > > > > generated segments are searched first?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Wei

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org