FrancisGodinho opened a new pull request, #20307:
URL: https://github.com/apache/kafka/pull/20307

   ## Ticket
   - https://issues.apache.org/jira/browse/KAFKA-17207
   
   ## Summary
   This PR fixes the memory leak in 
`ConnectDistributedRequestTimeoutIntegrationTest#testRequestTimeouts`, where 
the producer hangs indefinitely after the config topic is deleted.
   
   ### Root Cause:
   The hang occurs inside the Kafka producer's internal NetworkClient.poll() 
loop:
   
   
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L645
   
   `Selector.poll(...)` blocks while trying to reconnect to the broker and send 
metadata requests for the deleted config topic. Since the topic is gone and the 
`MetadataRecoveryStrategy` is set to `NONE`, the metadata updater keeps 
retrying, and the producer neither times out nor fails fast. As a result, the 
test's resources hang indefinitely.
   
   ### Why not fix Kafka core?
   While this looks like a Kafka client bug at first, it is technically 
consistent with Kafka's design: the system is eventually consistent and assumes 
topics may become available again. Therefore, hanging here is expected behavior 
unless the producer is explicitly closed or the topic reappears.
   
   Fixing this behavior in the Kafka producer would require rethinking the 
metadata handling policy or tightening timeouts, which is risky and out of 
scope for this test.
   
   ### Fix:
   I restore the deleted topic at the end of the test to allow the producer to 
recover and shut down cleanly.
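
   A minimal, broker-free sketch of the cleanup pattern described above: delete 
the topic, run the test body, and always recreate the topic afterwards. The 
`TopicCluster` interface, the `InMemoryCluster` fake, the `withTopicDeleted` 
helper, and the topic name `connect-configs` are all hypothetical names used 
for illustration; they are not the code in this PR, and using `try`/`finally` 
is an assumption about how the restore could be guaranteed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class RestoreTopicSketch {
    /** Hypothetical stand-in for the cluster handle a test would use. */
    interface TopicCluster {
        void createTopic(String name);
        void deleteTopic(String name);
        boolean topicExists(String name);
    }

    /** In-memory fake so the sketch runs without a broker. */
    static final class InMemoryCluster implements TopicCluster {
        private final Set<String> topics = ConcurrentHashMap.newKeySet();

        public void createTopic(String name) { topics.add(name); }
        public void deleteTopic(String name) { topics.remove(name); }
        public boolean topicExists(String name) { return topics.contains(name); }
    }

    /** Deletes the topic, runs the body, and recreates the topic even if the body throws. */
    static void withTopicDeleted(TopicCluster cluster, String topic, Runnable body) {
        cluster.deleteTopic(topic);
        try {
            body.run();
        } finally {
            // Restore the topic so a client that is still retrying metadata
            // requests has something to recover against before shutdown.
            cluster.createTopic(topic);
        }
    }

    public static void main(String[] args) {
        TopicCluster cluster = new InMemoryCluster();
        cluster.createTopic("connect-configs"); // hypothetical topic name

        withTopicDeleted(cluster, "connect-configs", () ->
            System.out.println("exists during body: " + cluster.topicExists("connect-configs")));

        System.out.println("exists after body: " + cluster.topicExists("connect-configs"));
    }
}
```

   Putting the restore in `finally` means the topic comes back even when an 
assertion in the test body fails, which is the case most likely to leave a 
producer stuck otherwise.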


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
