If I understand it correctly, the reason we need `abortAndPauseCleaning` in LogCleanerManager is to block log deletion/truncation while the log-cleaner thread is reading segments during compaction.
The race condition comes from relying on `abortAndPauseCleaning` to protect concurrent access/modification of the log. On other code paths, when we want to protect concurrent operations on the log, we use the `lock` object. I may be wrong, but wouldn't it be better to replace the `lock` object with a read-write lock and have the log-cleaner thread acquire the read lock while it is doing the compaction? I think avoiding race conditions at the log level is better than relying on additional data structures in upper layers; we don't know whether other corner cases exist if there is no protection at all while reading the segments. [ Full content available at: https://github.com/apache/kafka/pull/5591 ]
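To make the suggestion concrete, here is a minimal sketch of the idea. The class and method names are hypothetical, not Kafka's actual `Log` API: the point is only that deletion/truncation takes the write lock while the cleaner's segment scan takes the read lock, so they cannot interleave, and multiple readers are still not serialized against each other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical log guarded by a read-write lock instead of relying on
// abortAndPauseCleaning in an upper layer.
class GuardedLog {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> segments =
            new ArrayList<>(List.of("seg-0", "seg-1", "seg-2"));

    // The cleaner thread holds the read lock while scanning segments for
    // compaction, which blocks concurrent deletion/truncation but allows
    // other concurrent readers.
    List<String> readSegmentsForCompaction() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(segments);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Deletion/truncation takes the write lock, so it waits for any
    // in-flight compaction scan to finish before mutating the segment list.
    void deleteOldestSegment() {
        lock.writeLock().lock();
        try {
            if (!segments.isEmpty()) {
                segments.remove(0);
            }
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With this shape, the cleaner never observes a segment list that is being mutated, without any coordination through a separate manager-level data structure.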
