tlrx opened a new pull request, #16281: URL: https://github.com/apache/lucene/pull/16281
This pull request follows the discussion on #16086.

It adds a new `CodecUtil.checksumEntireFile(IndexInput, OneMerge)` that periodically checks whether the merge has been aborted while reading through the file. This avoids spending time checksumming large files when the merge has already been cancelled. The check runs every 1 MB. This idea was suggested by Robert (thanks!).

It also adds a `checkIntegrity(OneMerge)` method to all codec reader/producer base classes, with a default implementation that delegates to the existing `checkIntegrity()`. I think I have updated all current codec implementations to propagate the merge to `CodecUtil.checksumEntireFile(IndexInput, OneMerge)`.

An alternative would be to make `checkIntegrity(OneMerge)` abstract and `checkIntegrity()` final, delegating to `checkIntegrity(null)`. That would prevent subclasses from silently ignoring the merge parameter, but it would require updating all codec implementations, including the backward-compatibility codecs and third-party ones. Another option is simply to replace `checkIntegrity()` with `checkIntegrity(OneMerge)`.

Happy to hear your thoughts on this.

Closes #13354
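For readers who want a concrete picture of the periodic abort check, here is a minimal, self-contained sketch of the idea. It is not the code from this pull request and does not use Lucene's classes: `AbortableChecksum`, `AbortedException`, `CHECK_INTERVAL_BYTES`, the `BooleanSupplier` standing in for the merge, and the 64 KB buffer size are all hypothetical names and choices made for illustration, with a plain `InputStream` and `java.util.zip.CRC32` standing in for `IndexInput` and the codec checksum.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.BooleanSupplier;
import java.util.zip.CRC32;

/** Illustrative only: checksums a stream, polling an abort flag every 1 MB. */
final class AbortableChecksum {

  /** Assumed interval, matching the 1 MB mentioned in the description. */
  static final long CHECK_INTERVAL_BYTES = 1L << 20;

  /** Hypothetical exception standing in for a "merge aborted" signal. */
  static final class AbortedException extends IOException {
    AbortedException(String message) {
      super(message);
    }
  }

  /**
   * Reads the whole stream and returns its CRC32. If {@code aborted} is
   * non-null, it is polled each time at least 1 MB has been read since the
   * previous poll; a null supplier means "never abort".
   */
  static long checksum(InputStream in, BooleanSupplier aborted) throws IOException {
    CRC32 crc = new CRC32();
    byte[] buffer = new byte[64 * 1024];
    long total = 0;
    long sinceLastCheck = 0;
    int n;
    while ((n = in.read(buffer)) != -1) {
      crc.update(buffer, 0, n);
      total += n;
      sinceLastCheck += n;
      if (sinceLastCheck >= CHECK_INTERVAL_BYTES) {
        sinceLastCheck = 0;
        if (aborted != null && aborted.getAsBoolean()) {
          throw new AbortedException("aborted after reading " + total + " bytes");
        }
      }
    }
    return crc.getValue();
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[3 * 1024 * 1024 + 17];
    // Not aborted: the full checksum is computed.
    System.out.println(checksum(new ByteArrayInputStream(data), () -> false));
    // Aborted: the first poll (after 1 MB) stops the read early.
    try {
      checksum(new ByteArrayInputStream(data), () -> true);
    } catch (AbortedException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Passing `null` for the supplier mirrors the `checkIntegrity(null)` delegation discussed above: the caller gets the plain, uninterruptible behaviour. Polling per megabyte rather than per buffer keeps the overhead of the check negligible while still bounding the wasted work after a cancellation.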
