[ 
https://issues.apache.org/jira/browse/SAMZA-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173731#comment-16173731
 ] 

ASF GitHub Bot commented on SAMZA-1392:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/samza/pull/272


> KafkaSystemProducer performance and correctness with concurrent sends and 
> flushes
> ---------------------------------------------------------------------------------
>
>                 Key: SAMZA-1392
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1392
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Jake Maes
>            Assignee: Jake Maes
>             Fix For: 0.14.0
>
>         Attachments: Producer Performance Tests for SAMZA-1392 - Sheet1.pdf
>
>
> There are two issues we need to fix in the KafkaSystemProducer when sends 
> and flushes are called concurrently:
> 1. Concurrent sends contend for the sendlock, especially when producer 
> compression is enabled. The fix is to use the producer.flush() API, which 
> Kafka has supported since at least version 0.9.x. Because flush() blocks 
> until all previously sent records have completed, we no longer need to track 
> the latest future, and so we no longer need the lock.
> 2. When task.async.commit is enabled, the threads calling send() could set 
> the exceptionInCallback to null before the exception is handled in user code 
> or flush(). This could allow us to checkpoint offsets for which the 
> corresponding output was not successfully sent.
> The short-term solution here is to handle callback exceptions only in 
> flush() and to let users configure specific exceptions as ignorable in case 
> they don't want flush() to fail.
> The long-term solution is to support a fully asynchronous SystemProducer, 
> tracked in SAMZA-1393.
> I found issue #2 while working on issue #1, so while they're separate issues, 
> it's easier to fix them with one ticket/patch.
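
The short-term fix for issue #2 can be illustrated with a minimal sketch. This is not the actual KafkaSystemProducer code: the class name FlushErrorSketch, the onSendCompletion() method, the simulated-failure parameter, and the shape of the ignorable-exception set are all assumptions made for illustration. It only models the idea that send() never clears the stored callback exception and that flush() is the single place where it is surfaced. Clearing the exception after flush() reports it is likewise an assumption of this sketch.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch only; names and structure are NOT taken from Samza.
final class FlushErrorSketch {
    // First unhandled callback failure. Callback threads write it;
    // only flush() reads and clears it (an assumption of this sketch).
    private final AtomicReference<Exception> exceptionInCallback =
            new AtomicReference<>();

    // Exception types the user has configured as ignorable (assumed shape).
    private final Set<Class<? extends Exception>> ignorable;

    FlushErrorSketch(Set<Class<? extends Exception>> ignorable) {
        this.ignorable = ignorable;
    }

    private boolean isIgnorable(Exception error) {
        for (Class<? extends Exception> type : ignorable) {
            if (type.isInstance(error)) {
                return true;
            }
        }
        return false;
    }

    // Stand-in for the producer's send-completion callback.
    void onSendCompletion(Exception error) {
        if (error != null && !isIgnorable(error)) {
            // Keep the first failure; later failures do not overwrite it.
            exceptionInCallback.compareAndSet(null, error);
        }
    }

    // send() takes no lock and never reads or resets exceptionInCallback.
    // The failure is passed in directly to simulate an asynchronous callback.
    void send(String message, Exception simulatedFailure) {
        onSendCompletion(simulatedFailure);
    }

    // A real producer would first call producer.flush() here to wait for
    // outstanding sends; this sketch only models the error handling.
    void flush() {
        Exception failure = exceptionInCallback.getAndSet(null);
        if (failure != null) {
            throw new IllegalStateException("Send failed before flush", failure);
        }
    }

    public static void main(String[] args) {
        FlushErrorSketch producer = new FlushErrorSketch(Collections.emptySet());
        producer.send("first", new RuntimeException("broker unavailable"));
        producer.send("second", null); // a later successful send
        try {
            producer.flush();
        } catch (IllegalStateException e) {
            System.out.println("flush() reported: " + e.getCause().getMessage());
        }
    }
}
```

In this sketch a successful send that follows a failed one cannot hide the failure, so a commit that calls flush() before checkpointing would still see it.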



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
