bruno-roustant commented on issue #701: LUCENE-8836 Optimize DocValues TermsDict to continue scanning from the last position when possible
URL: https://github.com/apache/lucene-solr/pull/701#issuecomment-499546598

"can the optimization be implemented in a less-invasive manner?"

Well, the change optimizes both seekExact(ord) and seekCeil(term), and it handles calls to next() to keep track of the last accessed term. The change also reduces the binary search range in TermsDict.seekTermsIndex() by comparing the target to the last accessed term. I already tried to keep it minimal (although I added comments), so I don't think I can do it with less.

"How best might the performance of DocValues be evaluated?"

The approach I took was to run some Lucene tests while counting the total number of seeks and term reads in the IndexInput, with and without the optimization.

TestLucene70DocValuesFormat - the optimization saves 24% seeks and 15% term reads.
TestDocValuesQueries - the optimization adds 0.7% seeks and 0.003% term reads.
TestDocValuesRewriteMethod.testRegexps - the optimization saves 71% seeks and 82% term reads.
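To make the idea in the comment above concrete, here is a minimal sketch of "continue scanning from the last position when possible". It is not the code from the pull request and not Lucene's TermsDict: the class name TermsDictSketch, the in-memory String array, the fixed block size, and the counters seeks, termReads and indexComparisons are all assumptions made for illustration. The sketch only shows the two mechanisms the comment names: reusing the current position in seekExact(ord) and seekCeil(term), and narrowing the binary search over block start terms by first comparing the target to the last accessed term.

```java
/** Hypothetical illustration only; not Lucene's TermsDict. */
final class TermsDictSketch {
  private final String[] terms;  // sorted, unique terms standing in for on-disk data
  private final int blockSize;   // terms per block (assumed layout)
  private int ord = -1;          // ordinal of the last accessed term, -1 if none
  int seeks = 0;                 // jumps to a block start
  int termReads = 0;             // terms decoded while seeking or scanning
  int indexComparisons = 0;      // block start terms compared during binary search

  TermsDictSketch(String[] sortedTerms, int blockSize) {
    if (sortedTerms.length == 0 || blockSize <= 0) {
      throw new IllegalArgumentException("need at least one term and a positive block size");
    }
    this.terms = sortedTerms.clone();
    this.blockSize = blockSize;
  }

  /** Moves to the next term and returns it, or returns null at the end. */
  String next() {
    if (ord + 1 >= terms.length) return null;
    ord++;
    termReads++;
    return terms[ord];
  }

  /** Jumps to the first term of a block; this is the costly operation to avoid. */
  private void seekBlock(int block) {
    ord = block * blockSize;
    seeks++;
    termReads++;
  }

  /** Positions on the term with the given ordinal and returns it. */
  String seekExact(int target) {
    int block = target / blockSize;
    // Continue from the current position if the target is ahead in the same block.
    boolean canContinue = ord >= 0 && ord <= target && ord / blockSize == block;
    if (!canContinue) seekBlock(block);
    while (ord < target) {
      ord++;
      termReads++;
    }
    return terms[ord];
  }

  /** Returns the ordinal of the smallest term >= target, or -1 if there is none. */
  int seekCeil(String target) {
    int lo = 0;
    int hi = (terms.length - 1) / blockSize;
    boolean currentIsBefore = false;
    if (ord >= 0) {
      int cmp = terms[ord].compareTo(target);
      if (cmp == 0) return ord;
      currentIsBefore = cmp < 0;
      // Narrow the block range using the last accessed term.
      if (currentIsBefore) lo = ord / blockSize; else hi = ord / blockSize;
    }
    // Binary search for the last block whose first term is <= target.
    int block = lo;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      indexComparisons++;
      if (terms[mid * blockSize].compareTo(target) <= 0) {
        block = mid;
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    // Keep scanning from the current term if it is before the target in that block.
    boolean canContinue = currentIsBefore && ord / blockSize == block;
    if (!canContinue) seekBlock(block);
    while (terms[ord].compareTo(target) < 0) {
      if (ord + 1 >= terms.length) return -1;
      ord++;
      termReads++;
    }
    return ord;
  }
}
```

With counters like these, the same sequence of calls can be replayed with the reuse logic enabled and disabled (for example by forcing canContinue to false) to compare seek and term-read totals, which is similar in spirit to the measurement described in the comment.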
