anuragrai16 opened a new pull request, #18885: URL: https://github.com/apache/pinot/pull/18885
When the realtime -> immutable conversion path runs with `reuseMutableIndex`, `convertMutableSegment` copies the mutable Lucene index to the v1 destination and opens the writer with `CREATE_OR_APPEND`. If a prior conversion attempt crashed or was killed mid-merge, the destination can hold leftover Lucene segments that `FileUtils.copyDirectory` preserves. `CREATE_OR_APPEND` then opens the highest `segments_N` file, which may reference the stale segments, and the resulting Lucene index ends up with a different document count than the surrounding Pinot segment. At query time, `DocIdTranslator`'s mapping buffer (sized by the segment's `numDocs`) throws `ArrayIndexOutOfBoundsException` for orphan Lucene docIDs.

**Changes:**

- Clean the destination directory before copying in both `LuceneTextIndexCreator` and `MultiColumnLuceneTextIndexCreator`.
- Add regression tests that prime the destination with a force-merged stale index (its bumped `segments_N` counter deterministically survives the copy) and assert both creators wipe and rebuild correctly.
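For readers who want to see the failure mode and the fix in isolation, here is a minimal JDK-only sketch. It is not the code from this PR and does not use Lucene or Pinot: `CleanCopyDemo`, `deleteRecursively`, `copyDirectory`, and `cleanThenCopy` are hypothetical names, the `copyDirectory` helper is an assumed stand-in for the merge-style behaviour the description attributes to `FileUtils.copyDirectory`, and the empty `segments_1` / `segments_5` files are placeholders for real Lucene commit files.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.stream.Stream;

public class CleanCopyDemo {

  // Hypothetical helper: remove a directory tree if it exists.
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        Files.delete(file);
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult postVisitDirectory(Path d, IOException exc) throws IOException {
        if (exc != null) {
          throw exc;
        }
        Files.delete(d); // directory is empty by now
        return FileVisitResult.CONTINUE;
      }
    });
  }

  // Merge-style copy: files already in dest that are absent from src are left in place.
  static void copyDirectory(Path src, Path dest) throws IOException {
    try (Stream<Path> paths = Files.walk(src)) {
      for (Path p : (Iterable<Path>) paths::iterator) {
        Path target = dest.resolve(src.relativize(p).toString());
        if (Files.isDirectory(p)) {
          Files.createDirectories(target);
        } else {
          Files.copy(p, target, StandardCopyOption.REPLACE_EXISTING);
        }
      }
    }
  }

  // The idea behind the fix: wipe the destination first so nothing stale survives the copy.
  static void cleanThenCopy(Path src, Path dest) throws IOException {
    deleteRecursively(dest);
    copyDirectory(src, dest);
  }

  public static void main(String[] args) throws IOException {
    Path src = Files.createTempDirectory("mutable-index");
    Path dest = Files.createTempDirectory("v1-dest");
    Files.createFile(src.resolve("segments_1"));  // placeholder for the fresh commit file
    Files.createFile(dest.resolve("segments_5")); // placeholder for a leftover from a crashed attempt

    copyDirectory(src, dest);
    System.out.println("stale file after plain copy: " + Files.exists(dest.resolve("segments_5")));

    cleanThenCopy(src, dest);
    System.out.println("stale file after clean + copy: " + Files.exists(dest.resolve("segments_5")));
  }
}
```

In this sketch the plain copy leaves the higher-numbered placeholder in the destination, which is the situation where an append-mode writer could pick up the stale commit; cleaning first leaves only what the source contained.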
