Hey folks,

While investigating a regression between OpenSearch 2.17.1 (Lucene 9.11.1) and 2.18.0 (Lucene 9.12.0) for a simple Term Query on the process.name field in the Big5 workload, I noticed that the new Lucene912PostingsReader creates the ImpactsEnum by wrapping a SlowImpactsEnum over postings when a field only has IndexOptions.DOCS.
The query:

    curl -X POST "http://localhost:9200/big5/_search" -H "Content-Type: application/json" -d '
    {
      "query": {
        "term": {
          "process.name": "kernel"
        }
      }
    }'

Lucene912PostingsReader ->> ImpactsEnum impacts(FieldInfo fieldInfo, BlockTermState state, int flags)

has an extra check on *indexHasFreqs*:

    if (state.docFreq >= BLOCK_SIZE
        && indexHasFreqs
        && (indexHasPositions == false
            || PostingsEnum.featureRequested(flags, PostingsEnum.POSITIONS) == false)) {
      return new BlockImpactsDocsEnum(fieldInfo, (IntBlockTermState) state);
    }

Whereas Lucene99PostingsReader creates the faster BlockImpactsDocsEnum for fields with IndexOptions.DOCS, and only creates the SlowImpactsEnum when the document frequency is at most 128 (the block size):

Lucene99PostingsReader ->> ImpactsEnum impacts(FieldInfo fieldInfo, BlockTermState state, int flags)

    if (state.docFreq <= BLOCK_SIZE) {
      // no skip data
      return new SlowImpactsEnum(postings(fieldInfo, state, null, flags));
    }

    if (indexHasPositions == false
        || PostingsEnum.featureRequested(flags, PostingsEnum.POSITIONS) == false) {
      return new BlockImpactsDocsEnum(fieldInfo, (IntBlockTermState) state);
    }

Since Lucene 9.12.0 wraps the postings in a SlowImpactsEnum, whose advanceShallow method is a no-op, the Term Query is never able to skip data when called from the bulk scorer via DISI#nextDoc(). In Lucene 9.11.1, advanceShallow does get used and skips over a lot of docs, so the query completes much faster.

On the Big5 index with 116 million docs, the query takes >200ms on Lucene 9.12.0 versus <=5ms on Lucene 9.11.1.

I tried reindexing process.name into another index with docs_and_freqs enabled, and the query latency came back to normal, since it then uses BlockImpactsDocsEnum as its ImpactsEnum.

Is this a bug in the 912 postings reader? Or is it not possible to use BlockImpactsDocsEnum with the new postings format?

Thanks,
Aniketh
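
P.S. In case it helps the discussion, here is a minimal, self-contained sketch of the two selection rules quoted above. It is not Lucene code: the names ImpactsSelectionSketch, Choice, choose99 and choose912 are made up for illustration, BLOCK_SIZE = 128 is assumed for both readers, and any branch that is not part of the excerpts above is reported as NOT_MODELED.

```java
// Simplified model of the quoted impacts() branches; NOT the actual Lucene source.
class ImpactsSelectionSketch {
  enum Choice { SLOW_IMPACTS, BLOCK_IMPACTS_DOCS, NOT_MODELED }

  // Assumption: 128 for both readers (the post only states it for Lucene99).
  static final int BLOCK_SIZE = 128;

  // Mirrors the Lucene99PostingsReader excerpt: no check on frequencies.
  static Choice choose99(int docFreq, boolean hasPositions, boolean positionsRequested) {
    if (docFreq <= BLOCK_SIZE) {
      return Choice.SLOW_IMPACTS; // no skip data
    }
    if (!hasPositions || !positionsRequested) {
      return Choice.BLOCK_IMPACTS_DOCS;
    }
    return Choice.NOT_MODELED; // branches outside the excerpt
  }

  // Mirrors the Lucene912PostingsReader excerpt: extra indexHasFreqs check.
  static Choice choose912(
      int docFreq, boolean hasFreqs, boolean hasPositions, boolean positionsRequested) {
    if (docFreq >= BLOCK_SIZE && hasFreqs && (!hasPositions || !positionsRequested)) {
      return Choice.BLOCK_IMPACTS_DOCS;
    }
    if (!hasFreqs) {
      // Observed behaviour described in the post for IndexOptions.DOCS fields.
      return Choice.SLOW_IMPACTS;
    }
    return Choice.NOT_MODELED; // branches outside the excerpt
  }

  public static void main(String[] args) {
    int docFreq = 1_000_000; // hypothetical high-frequency term
    // DOCS-only field: no freqs, no positions.
    System.out.println("99,  DOCS only:      " + choose99(docFreq, false, false));
    System.out.println("912, DOCS only:      " + choose912(docFreq, false, false, false));
    // Same term after reindexing with frequencies enabled.
    System.out.println("912, DOCS_AND_FREQS: " + choose912(docFreq, true, false, false));
  }
}
```

For the same high-frequency term, the model picks the block enum under the 99 rule and the slow wrapper under the 912 rule for a DOCS-only field. Turning frequencies on flips the 912 result back to the block enum, which matches what I saw after reindexing.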