GSharayu opened a new issue #6793:
URL: https://github.com/apache/incubator-pinot/issues/6793


   A Lucene index is typically composed of multiple sub-indexes (called 
segments in Lucene terminology). Each sub-index is an independently searchable 
index, and the documents stored in it have docIDs relative to that 
sub-index. So if a Pinot table column with a text index holds 400 docs in total 
and the underlying Lucene index has 2 sub-indexes with 200 docs each, the 
Lucene docIDs will run from 0 to 199 within each sub-index. 
   
   The search operation on Lucene calls our collector callback with the 
matching Lucene docID. There is a bug in this code: the collector treats that 
docID as absolute, but the docID passed to it is relative to the sub-index. So 
if the 5th document in the second sub-index matches, the collector gets docID 4, 
but it should use 200 + 4 = 204, the absolute Lucene docID across sub-indexes. 
Without this adjustment, we match the wrong document. 
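
   The arithmetic above can be sketched in plain Java. This is only an 
illustration under stated assumptions, not the actual Pinot fix: `Segment`, 
`buildSegments` and `toGlobalDocId` are hypothetical names, and no Lucene 
classes are used. In Lucene itself, the per-segment offset is exposed as 
`LeafReaderContext.docBase`, which a collector receives through 
`Collector.getLeafCollector(LeafReaderContext)`.

```java
import java.util.ArrayList;
import java.util.List;

public class DocIdRebaseSketch {

    /** Hypothetical stand-in for one Lucene sub-index (segment). */
    static final class Segment {
        final int docBase;  // index-wide docID of this segment's first document
        final int numDocs;  // number of documents stored in this segment

        Segment(int docBase, int numDocs) {
            this.docBase = docBase;
            this.numDocs = numDocs;
        }
    }

    /** Assigns each segment a docBase equal to the total size of the segments before it. */
    static List<Segment> buildSegments(int... segmentSizes) {
        List<Segment> segments = new ArrayList<>();
        int docBase = 0;
        for (int size : segmentSizes) {
            segments.add(new Segment(docBase, size));
            docBase += size;
        }
        return segments;
    }

    /** Converts a segment-relative docID into an index-wide (absolute) docID. */
    static int toGlobalDocId(Segment segment, int segmentRelativeDocId) {
        return segment.docBase + segmentRelativeDocId;
    }

    public static void main(String[] args) {
        // Two sub-indexes of 200 docs each, as in the example above.
        List<Segment> segments = buildSegments(200, 200);
        Segment second = segments.get(1);

        int relativeDocId = 4; // 5th document of the second sub-index
        int globalDocId = toGlobalDocId(second, relativeDocId);

        System.out.println("relative=" + relativeDocId + " global=" + globalDocId);
    }
}
```

   A collector that records `relativeDocId` directly would report document 4, 
which belongs to the first sub-index; adding the segment's docBase gives 204.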
   
   The bug was fixed earlier in the offline text index search code path. A 
similar fix needs to be made in the realtime text index search code path.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


