siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400552708
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java
##########
@@ -70,6 +76,10 @@ public LuceneTextIndexReader(String column, File segmentIndexDir) {
       // Disable Lucene query result cache. While it helps a lot with performance for
       // repeated queries, on the downside it cause heap issues.
       _indexSearcher.setQueryCache(null);
+      // TODO: consider using a threshold of num docs per segment to decide between building
+      // mapping file upfront on segment load v/s on-the-fly during query processing
+      _docIdReaderWriter = new DocIdReaderWriter(segmentIndexDir, _column, numDocs);
+      _docIdReaderWriter.buildDocIdMapping(numDocs);

Review comment:
Also, I explored doing this in TextIndexHandler, but that is not a good fit: it would require opening the Lucene index and creating a searcher twice, once in the handler and once here, where both are needed anyway for query processing. I think it is better to avoid that and open the Lucene index reader and searcher just once per index.
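
To illustrate the trade-off described in the review comment, here is a minimal sketch, not the PR's code. It uses no Lucene calls: FakeSearcher, TextIndexReaderSketch, buildMappingInHandler and OPEN_COUNT are hypothetical stand-ins, and the reversed docId order is made-up data. The sketch only counts how many times the index is opened under each approach.

```java
import java.util.concurrent.atomic.AtomicInteger;

class OpenOnceSketch {
  // Counts how many times the (expensive) index open happens.
  static final AtomicInteger OPEN_COUNT = new AtomicInteger();

  /** Hypothetical stand-in for an opened Lucene index reader plus searcher. */
  static final class FakeSearcher {
    private final int[] pinotDocIds;

    FakeSearcher(int numDocs) {
      OPEN_COUNT.incrementAndGet();
      // Made-up data: pretend the stored Pinot docIds are in reverse order.
      pinotDocIds = new int[numDocs];
      for (int luceneDocId = 0; luceneDocId < numDocs; luceneDocId++) {
        pinotDocIds[luceneDocId] = numDocs - 1 - luceneDocId;
      }
    }

    int storedPinotDocId(int luceneDocId) {
      return pinotDocIds[luceneDocId];
    }
  }

  /** Reader that opens the index once and builds the mapping with that same searcher. */
  static final class TextIndexReaderSketch {
    private final FakeSearcher searcher;
    private final int[] luceneToPinot;

    TextIndexReaderSketch(int numDocs) {
      searcher = new FakeSearcher(numDocs); // the only open for this index
      luceneToPinot = new int[numDocs];
      for (int luceneDocId = 0; luceneDocId < numDocs; luceneDocId++) {
        luceneToPinot[luceneDocId] = searcher.storedPinotDocId(luceneDocId);
      }
    }

    int toPinotDocId(int luceneDocId) {
      return luceneToPinot[luceneDocId];
    }
  }

  /** The alternative from the comment: a handler opens the index only to build the mapping. */
  static int[] buildMappingInHandler(int numDocs) {
    FakeSearcher handlerSearcher = new FakeSearcher(numDocs); // extra open
    int[] mapping = new int[numDocs];
    for (int luceneDocId = 0; luceneDocId < numDocs; luceneDocId++) {
      mapping[luceneDocId] = handlerSearcher.storedPinotDocId(luceneDocId);
    }
    return mapping;
  }

  public static void main(String[] args) {
    TextIndexReaderSketch reader = new TextIndexReaderSketch(4);
    System.out.println("opens, reader only: " + OPEN_COUNT.get());
    System.out.println("lucene docId 0 -> pinot docId " + reader.toPinotDocId(0));

    OPEN_COUNT.set(0);
    buildMappingInHandler(4);
    new TextIndexReaderSketch(4);
    System.out.println("opens, handler plus reader: " + OPEN_COUNT.get());
  }
}
```

In this sketch the handler route always pays for one extra open, because the reader still needs its own searcher for query processing.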