Gaurav Narula created KAFKA-19458:
-------------------------------------

             Summary: Successive AlterReplicaLogDirsRequest on a topic 
partition may leak log segments
                 Key: KAFKA-19458
                 URL: https://issues.apache.org/jira/browse/KAFKA-19458
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 4.0.0, 3.9.1, 4.1.0
            Reporter: Gaurav Narula


Successive {{AlterReplicaLogDirsRequest}}s that change the log directory of a 
given topic partition may leak log segments. Consider the following scenario:

1. A request tries to change the logdir for topic partition {{tp}} from {{d1}} 
to {{d2}}.
2. The handler invokes {{replicaManager#alterReplicaLogDirs}}.
3. A future replica is created as a result of the above method invoking 
{{partition#maybeCreateFutureReplica}}, and cleaning for {{tp}} is paused as 
{{logManager#abortAndPauseCleaning}} is invoked.
4. Now, *before* the previous request has completed, assume another request 
arrives to change the logdir from {{d2}} to {{d3}}.
5. This time, {{replicaManager#alterReplicaLogDirs}}'s call to 
{{partition#futureReplicaDirChanged}} will return {{true}} and we remove the 
fetcher and future.
6. We then re-create the future replica by invoking 
{{partition#maybeCreateFutureReplica}} with {{d3}} and pause log cleaning for 
{{tp}} *again*.
7. {{partition#maybeReplaceCurrentWithFutureReplica}} is invoked when the 
future has caught up and the callback in it swaps the future log for the local 
log and resumes cleaning by invoking {{LogManager#resumeCleaning}}.
8. The above decrements the count in {{LogCleaningState.logCleaningPaused}} 
from {{2}} to {{1}}. The count never returns to {{0}}, so cleaning for {{tp}} 
remains paused, and the log segment for the discarded future is therefore 
leaked until a broker restart.
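
The counting problem in steps 3 to 8 can be reproduced with a small standalone model. This is only a sketch and not Kafka's actual code: the class {{PauseCountSketch}}, its map field, and the {{String}} partition key are hypothetical stand-ins, and the only behaviour assumed is that each pause increments a per-partition count and each resume decrements it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a reference-counted "cleaning paused" state.
// It is NOT the broker implementation; it only mirrors the counting in the scenario.
public class PauseCountSketch {
    // Partition name -> number of outstanding pause requests.
    private final Map<String, Integer> paused = new HashMap<>();

    // Models a pause call: every call adds one to the count.
    public void abortAndPauseCleaning(String tp) {
        paused.merge(tp, 1, Integer::sum);
    }

    // Models a resume call: removes the entry only when the count drops to zero.
    public void resumeCleaning(String tp) {
        Integer count = paused.get(tp);
        if (count == null) {
            throw new IllegalStateException("cleaning is not paused for " + tp);
        }
        if (count == 1) {
            paused.remove(tp);
        } else {
            paused.put(tp, count - 1);
        }
    }

    public int pausedCount(String tp) {
        return paused.getOrDefault(tp, 0);
    }

    public boolean isPaused(String tp) {
        return paused.containsKey(tp);
    }

    public static void main(String[] args) {
        PauseCountSketch state = new PauseCountSketch();
        state.abortAndPauseCleaning("tp"); // step 3: first request, future in d2
        state.abortAndPauseCleaning("tp"); // step 6: second request, future in d3
        state.resumeCleaning("tp");        // step 7: future caught up and swapped
        System.out.println("paused count after swap: " + state.pausedCount("tp"));
        System.out.println("still paused: " + state.isPaused("tp"));
    }
}
```

In this model the second pause has no matching resume, so the partition stays paused after the single resume in step 7.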



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
