GSharayu opened a new issue #6793: URL: https://github.com/apache/incubator-pinot/issues/6793
A Lucene index is likely to be composed of multiple sub-indexes (called segments in Lucene terminology). Each sub-index is an independent searchable index, and the documents stored in it have docIDs relative to that sub-index. So if a Pinot table column with a text index has 400 docs in total and the underlying Lucene index has 2 sub-indexes with 200 docs each, the Lucene docIDs will be 0 to 199 within each sub-index.

The search operation on Lucene calls our collector callback with the matching Lucene docID. There is a bug in this code: the matching Lucene docID passed to the collector is relative to the sub-index, but it is used as if it were absolute. So if the 5th document in the second sub-index matched, the collector gets docID 4, whereas the absolute Lucene docID across sub-indexes is 200 + 4 = 204. Without adding the sub-index's base offset, we match the wrong document.

This bug was fixed earlier in the offline text index search code path. A similar fix needs to be made in the realtime text index search code path.
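
To make the arithmetic concrete, here is a minimal sketch of the rebasing step in plain Java. It does not use Lucene and is not the actual Pinot fix: `DocIdRebaseSketch`, `SubIndex`, `layout` and `toAbsolute` are hypothetical names introduced only for illustration. In Lucene itself the per-sub-index offset is exposed as `LeafReaderContext.docBase`, so a real collector would presumably add that value to each docID it receives.

```java
import java.util.ArrayList;
import java.util.List;

public class DocIdRebaseSketch {

    /** Hypothetical stand-in for one Lucene sub-index (segment). */
    static final class SubIndex {
        final int docBase;  // absolute docID of this sub-index's first document
        final int numDocs;  // number of documents held by this sub-index

        SubIndex(int docBase, int numDocs) {
            this.docBase = docBase;
            this.numDocs = numDocs;
        }
    }

    /** Lays out sub-indexes back to back, so each docBase is the running total of earlier sizes. */
    static List<SubIndex> layout(int... sizes) {
        List<SubIndex> subIndexes = new ArrayList<>();
        int docBase = 0;
        for (int size : sizes) {
            subIndexes.add(new SubIndex(docBase, size));
            docBase += size;
        }
        return subIndexes;
    }

    /** Converts a docID relative to one sub-index into an absolute docID across all sub-indexes. */
    static int toAbsolute(SubIndex subIndex, int relativeDocId) {
        if (relativeDocId < 0 || relativeDocId >= subIndex.numDocs) {
            throw new IllegalArgumentException("docID out of range: " + relativeDocId);
        }
        return subIndex.docBase + relativeDocId;
    }

    public static void main(String[] args) {
        // The example from this issue: 400 docs split into 2 sub-indexes of 200 docs each.
        List<SubIndex> subIndexes = layout(200, 200);

        // The 5th document of the second sub-index has relative docID 4.
        int relativeDocId = 4;
        System.out.println(toAbsolute(subIndexes.get(1), relativeDocId)); // prints 204
    }
}
```

A collector that skips this step would report docID 4, which is the 5th document of the first sub-index rather than the one that actually matched.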
