Gaurav Narula created KAFKA-19458:
-------------------------------------

             Summary: Successive AlterReplicaLogDirsRequest on a topic partition may leak log segments
                 Key: KAFKA-19458
                 URL: https://issues.apache.org/jira/browse/KAFKA-19458
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 4.0.0, 3.9.1, 4.1.0
            Reporter: Gaurav Narula

Successive {{AlterReplicaLogDirsRequest}}s that change the log directory of the same topic partition may leak log segments. Consider the following scenario:

1. A request tries to change the logdir for topic partition {{tp}} from {{d1}} to {{d2}}.
2. The handler invokes {{replicaManager#alterReplicaLogDirs}}.
3. That method invokes {{partition#maybeCreateFutureReplica}}, which creates a future replica, and then {{logManager#abortAndPauseCleaning}}, which disables cleaning for {{tp}}.
4. *Before* the previous request has completed, another request arrives to change the logdir from {{d2}} to {{d3}}.
5. This time, the call from {{replicaManager#alterReplicaLogDirs}} to {{partition#futureReplicaDirChanged}} returns {{true}}, so the fetcher and the existing future replica are removed.
6. A new future replica is then created by invoking {{partition#maybeCreateFutureReplica}} with {{d3}}, and log cleaning for {{tp}} is paused *again*.
7. Once the future replica has caught up, {{partition#maybeReplaceCurrentWithFutureReplica}} is invoked. Its callback swaps the future log for the local log and resumes cleaning by invoking {{LogManager#resumeCleaning}}.
8. That call decrements the count in {{LogCleaningState.logCleaningPaused}} from {{2}} to {{1}}, not to {{0}}, so cleaning for {{tp}} remains paused.

The log segment for the discarded future is therefore leaked until a broker restart.
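
The counting behaviour in steps 3, 6 and 8 can be illustrated with a minimal, self-contained sketch. This is not Kafka's implementation: {{PauseCountSketch}} and its {{counts}} map are hypothetical names, and the method names are borrowed from the description above only to make the mapping to the steps obvious. It assumes that each pause increments a per-partition counter and each resume decrements it, which is what the {{2}} to {{1}} transition in step 8 implies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a per-partition "cleaning paused" counter.
// It only mirrors the names used in the issue text; it is not Kafka code.
class PauseCountSketch {
    private final Map<String, Integer> counts = new HashMap<>();

    // Each pause request increments the count for the partition.
    void abortAndPauseCleaning(String tp) {
        counts.merge(tp, 1, Integer::sum);
    }

    // Each resume decrements the count; the partition is only considered
    // unpaused once the count drops back to zero.
    void resumeCleaning(String tp) {
        Integer count = counts.get(tp);
        if (count == null) {
            throw new IllegalStateException("cleaning is not paused for " + tp);
        }
        if (count == 1) {
            counts.remove(tp);
        } else {
            counts.put(tp, count - 1);
        }
    }

    int pausedCount(String tp) {
        return counts.getOrDefault(tp, 0);
    }

    boolean isCleaningPaused(String tp) {
        return counts.containsKey(tp);
    }

    public static void main(String[] args) {
        PauseCountSketch state = new PauseCountSketch();
        state.abortAndPauseCleaning("tp"); // steps 1-3: move d1 -> d2, first pause
        state.abortAndPauseCleaning("tp"); // steps 4-6: move d2 -> d3, second pause
        state.resumeCleaning("tp");        // steps 7-8: future swapped in, single resume
        System.out.println("paused count = " + state.pausedCount("tp"));      // paused count = 1
        System.out.println("still paused = " + state.isCleaningPaused("tp")); // still paused = true
    }
}
```

In this model, two pauses followed by a single resume leave the counter at 1, so the partition stays paused. Only a balanced sequence, with one resume per pause, returns it to 0.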