[PR] Invert eligible BinaryColumns in the column pass [lucene]

via GitHub Sun, 24 May 2026 11:55:44 -0700


Tim-Brooks opened a new pull request, #16116:
URL: https://github.com/apache/lucene/pull/16116


   BinaryColumns are now inverted directly in the column-oriented pass
   when they are eligible: indexed with DOCS or DOCS_AND_FREQS, with
   omitNorms set, no term vectors (TVs), and not stored. The new
   processBinaryColumnInvert walks each eligible column's cursor
   sparsely. When any row-mode column is present in the batch, eligible
   columns are demoted to the row pass so that all inverted fields share
   a single termsHash frame per doc.
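
   As a rough illustration of the eligibility and demotion rule, here is
   a minimal sketch. The types ColumnRouting, ColumnSpec and IndexOpts
   and the method planColumnPass are hypothetical stand-ins, not classes
   from this PR, and the sketch assumes that every non-eligible column
   counts as row-mode.

```java
import java.util.List;

/** Sketch only: hypothetical stand-ins, not Lucene's actual classes. */
final class ColumnRouting {
  enum IndexOpts { NONE, DOCS, DOCS_AND_FREQS, DOCS_AND_FREQS_AND_POSITIONS }

  /** Hypothetical per-column settings relevant to eligibility. */
  static final class ColumnSpec {
    final IndexOpts indexOptions;
    final boolean omitNorms;
    final boolean termVectors;
    final boolean stored;

    ColumnSpec(IndexOpts indexOptions, boolean omitNorms, boolean termVectors, boolean stored) {
      this.indexOptions = indexOptions;
      this.omitNorms = omitNorms;
      this.termVectors = termVectors;
      this.stored = stored;
    }
  }

  /** DOCS or DOCS_AND_FREQS, norms omitted, no term vectors, not stored. */
  static boolean isEligible(ColumnSpec c) {
    boolean docsOrFreqs =
        c.indexOptions == IndexOpts.DOCS || c.indexOptions == IndexOpts.DOCS_AND_FREQS;
    return docsOrFreqs && c.omitNorms && !c.termVectors && !c.stored;
  }

  /**
   * Returns, per column, whether it is inverted in the column pass.
   * Assumption: a non-eligible column is a row-mode column, so a single
   * one demotes the whole batch to the row pass.
   */
  static boolean[] planColumnPass(List<ColumnSpec> batch) {
    boolean anyRowMode = false;
    for (ColumnSpec c : batch) {
      if (!isEligible(c)) {
        anyRowMode = true;
        break;
      }
    }
    boolean[] inColumnPass = new boolean[batch.size()];
    for (int i = 0; i < inColumnPass.length; i++) {
      // With no row-mode column present, every column here is eligible.
      inColumnPass[i] = !anyRowMode;
    }
    return inColumnPass;
  }
}
```

   Under that assumption the plan is all-or-nothing per batch, which is
   what keeps every inverted field of a doc inside one termsHash frame.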
   
   processBinaryColumnInvert does not call termsHash.startDocument or
   termsHash.finishDocument. Those calls drive TermVectorsConsumer's
   segment-scoped lastDocID, and framing the same batch doc once per
   eligible column would break its monotonicity invariant. Eligibility
   guarantees doVectors=false, so TV state is never touched, and
   TermVectorsConsumer.fill() reconciles the unframed docs. Exception
   handling mirrors processDocument: pf.finish runs only if the first
   pf.invert returned normally.
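
   The exception-handling rule can be sketched as follows. This is one
   reading of the sentence above, not the code from this PR: the
   Inverter interface, invertDoc and the Recorder test double are
   hypothetical, and the sketch assumes a doc may carry more than one
   value so that "first invert" is distinguishable from later ones.
   No startDocument/finishDocument framing is modelled here.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: hypothetical stand-ins, not Lucene's actual classes. */
final class InvertFinishSketch {
  /** Hypothetical per-field inverter with the two calls named in the text. */
  interface Inverter {
    void invert(int docId, byte[] value, boolean first);

    void finish(int docId);
  }

  /** Inverts all values of one doc; finish runs only if the first invert returned. */
  static void invertDoc(Inverter pf, int docId, List<byte[]> values) {
    boolean firstInvertOk = false;
    try {
      boolean first = true;
      for (byte[] value : values) {
        pf.invert(docId, value, first);
        firstInvertOk = true; // set once the first invert has returned normally
        first = false;
      }
    } finally {
      if (firstInvertOk) {
        pf.finish(docId);
      }
    }
  }

  /** Test double: records calls and throws on the Nth invert (1-based, 0 = never). */
  static final class Recorder implements Inverter {
    final List<String> calls = new ArrayList<>();
    final int failOn;
    int inverts;

    Recorder(int failOn) {
      this.failOn = failOn;
    }

    @Override
    public void invert(int docId, byte[] value, boolean first) {
      inverts++;
      if (inverts == failOn) {
        throw new IllegalStateException("invert failed");
      }
      calls.add("invert");
    }

    @Override
    public void finish(int docId) {
      calls.add("finish");
    }
  }
}
```

   With new Recorder(1) the first invert throws and finish is never
   called; with new Recorder(2) the failure comes after the first invert
   has succeeded, so finish still runs before the exception propagates.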
   
   The validation pass caches each column's PerField in docFields[],
   indexed by original column position. The row pass records the
   original indices in rowPfIndices[] instead of overwriting
   docFields[], so both passes reuse the cache without a second hash
   lookup.
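
   The caching scheme can be illustrated with a small sketch. Only the
   array names docFields and rowPfIndices come from the description
   above; PerFieldCacheSketch, its PerField class, validate, rowIndices
   and the hashLookups counter are hypothetical and exist only to make
   the lookup count observable.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: hypothetical stand-ins, not Lucene's actual classes. */
final class PerFieldCacheSketch {
  static final class PerField {
    final String name;

    PerField(String name) {
      this.name = name;
    }
  }

  private final Map<String, PerField> byName = new HashMap<>();
  int hashLookups; // counts hash-map lookups, for illustration

  private PerField lookup(String name) {
    hashLookups++;
    return byName.computeIfAbsent(name, PerField::new);
  }

  /** Validation pass: one hash lookup per column, cached by original position. */
  PerField[] validate(String[] columnNames) {
    PerField[] docFields = new PerField[columnNames.length];
    for (int i = 0; i < columnNames.length; i++) {
      docFields[i] = lookup(columnNames[i]);
    }
    return docFields;
  }

  /** Row pass: collect original positions instead of compacting docFields in place. */
  static int[] rowIndices(boolean[] rowMode) {
    int count = 0;
    for (boolean b : rowMode) {
      if (b) {
        count++;
      }
    }
    int[] rowPfIndices = new int[count];
    int n = 0;
    for (int i = 0; i < rowMode.length; i++) {
      if (rowMode[i]) {
        rowPfIndices[n++] = i;
      }
    }
    return rowPfIndices;
  }
}
```

   The row pass then reads docFields[rowPfIndices[k]], while the column
   pass can still read docFields[i] by original position, since nothing
   was overwritten.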

