[ 
https://issues.apache.org/jira/browse/SAMZA-1069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15773808#comment-15773808
 ] 

ASF GitHub Bot commented on SAMZA-1069:
---------------------------------------

GitHub user asfgit closed the pull request at:

    https://github.com/apache/samza/pull/37


> Deadlock between KafkaSystemProducer and KafkaProducer from kafka-clients lib
> -----------------------------------------------------------------------------
>
>                 Key: SAMZA-1069
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1069
>             Project: Samza
>          Issue Type: Bug
>    Affects Versions: 0.11.0
>            Reporter: Yi Pan (Data Infrastructure)
>            Assignee: Xinyu Liu
>             Fix For: 0.12.0
>
>
> We have identified a deadlock between the main thread, which calls
> KafkaSystemProducer.close(), and the KafkaProducer client lib's network
> thread, which invokes the callback function defined within
> KafkaSystemProducer.send().
> The scenario is the following:
> # The SamzaContainer main thread catches an exception from a previous commit
> and the container initiates shutdown, which calls KafkaSystemProducer.stop().
> That call grabs the synchronized producerLock in KafkaSystemProducer and
> calls KafkaProducer.flush() to wait for all pending requests to complete.
> # The KafkaProducer network I/O thread then calls KafkaSystemProducer’s
> callback function (in RecordBatch.done()), which waits on the same
> producerLock in KafkaSystemProducer. Until the callback returns, the network
> thread cannot call producerFuture.done() and release the CountDownLatch that
> the main thread in KafkaSystemProducer.close() is waiting on. Hence, deadlock!
> We need to make sure that KafkaSystemProducer.close() cannot race with the
> callbacks triggered by the KafkaProducer's network thread.
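
The lock ordering described above can be reproduced outside Samza with a small
Java sketch. Everything in it is a hypothetical stand-in, not the actual
KafkaSystemProducer code or the fix merged in the pull request: the class
LockedCloseSketch, its methods, and the closeStarted latch are invented for
illustration, and the wait uses a timeout (unlike the real flush()) only so
that the deadlock-prone variant terminates. It contrasts waiting for the
callback while holding producerLock with waiting after releasing it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical model of the scenario; not the real KafkaSystemProducer.
class LockedCloseSketch {
    // Stand-in for the synchronized producerLock.
    private final Object producerLock = new Object();
    // Stand-in for the latch the flushing thread waits on (one pending send).
    private final CountDownLatch pending = new CountDownLatch(1);
    // Test-only latch: forces the callback to run after close has taken the lock.
    private final CountDownLatch closeStarted = new CountDownLatch(1);

    // Runs on the "network thread": needs the lock before it can complete the send.
    void onSendComplete() {
        synchronized (producerLock) {
            // Callback bookkeeping would happen here.
        }
        pending.countDown();
    }

    // Deadlock-prone: waits for the callback while holding the lock it needs.
    boolean closeHoldingLock(long timeoutMs) throws InterruptedException {
        synchronized (producerLock) {
            closeStarted.countDown();
            return pending.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
    }

    // Safer ordering: release the lock first, then wait for pending sends.
    boolean closeReleasingLock(long timeoutMs) throws InterruptedException {
        synchronized (producerLock) {
            closeStarted.countDown();
        }
        return pending.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    // Returns true if the pending send completed before the timeout.
    static boolean demo(boolean holdLockWhileWaiting) throws InterruptedException {
        LockedCloseSketch p = new LockedCloseSketch();
        Thread network = new Thread(() -> {
            try {
                p.closeStarted.await();   // wait until close has taken the lock
                p.onSendComplete();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        network.start();
        boolean flushed = holdLockWhileWaiting
                ? p.closeHoldingLock(500)
                : p.closeReleasingLock(500);
        network.join();                   // callback finishes once the lock is free
        return flushed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("holding lock:   flushed=" + demo(true));
        System.out.println("releasing lock: flushed=" + demo(false));
    }
}
```

In the first variant the wait can only end by timing out, because the callback
is blocked on producerLock; with an untimed wait, as in the report, both
threads would block forever. In the second variant the callback acquires the
lock as soon as close releases it, so the wait completes.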



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
