jpountz opened a new pull request, #1052:
URL: https://github.com/apache/lucene/pull/1052

   This commit adds a new `TermsEnumIndex` abstraction in `oal.index` that wraps a
   `TermsEnum` together with the index of the segment it belongs to. It can be used
   to create priority queues that merge `TermsEnum` instances, whether they come
   from the inverted index or from doc values. In both cases, a long holding the
   first 8 bytes of the term is computed in order to speed up comparisons. In the
   doc-values case, `OrdinalMap` also leverages seek-by-ord capabilities to reason
   about prefixes shared across entire windows of terms, so that those shared
   prefixes are not compared again whenever the queue is re-ordered. This should
   especially help with fields that may share long common prefixes, such as URLs.
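
To illustrate the 8-byte prefix idea described above, here is a minimal sketch. It is not the code from this pull request: the class `PrefixCompareSketch` and the methods `prefix8` and `compare` are hypothetical names, and terms are modeled as plain `byte[]` rather than Lucene's own types. It assumes terms are ordered as unsigned bytes and that short terms can be zero-padded, with a full comparison as the fallback when the prefixes tie.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of Lucene's API.
final class PrefixCompareSketch {

  /** Packs the first 8 bytes of the term big-endian into a long, zero-padding short terms. */
  static long prefix8(byte[] term) {
    long prefix = 0L;
    for (int i = 0; i < 8; i++) {
      int b = i < term.length ? (term[i] & 0xFF) : 0;
      prefix = (prefix << 8) | b;
    }
    return prefix;
  }

  /**
   * Compares two terms in unsigned byte order. The precomputed prefixes settle
   * most comparisons with a single long comparison; only ties fall back to
   * comparing the full byte arrays.
   */
  static int compare(byte[] a, long prefixA, byte[] b, long prefixB) {
    if (prefixA != prefixB) {
      // Big-endian packing makes unsigned long order match unsigned byte order.
      return Long.compareUnsigned(prefixA, prefixB);
    }
    // Equal prefixes: the terms may still differ after byte 8 or in length.
    return Arrays.compareUnsigned(a, b);
  }
}
```

In this sketch, the prefix would be computed once each time an enum advances to a new term, so every comparison made while re-ordering the queue starts with a cheap long comparison. Terms that share their first 8 bytes, such as URLs starting with `http://w`, always hit the fallback, which is the case the shared-prefix reasoning in the doc-values path is meant to address.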
   
   On luceneutil's `OrdinalMap` benchmark, construction time dropped by 30.5% for
   the `id` field and by 17.5% for the `name` field.
   
   JIRA: [LUCENE-10560](https://issues.apache.org/jira/browse/LUCENE-10560)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

