[
https://issues.apache.org/jira/browse/SAMZA-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173731#comment-16173731
]
ASF GitHub Bot commented on SAMZA-1392:
---------------------------------------
Github user asfgit closed the pull request at:
https://github.com/apache/samza/pull/272
> KafkaSystemProducer performance and correctness with concurrent sends and
> flushes
> ---------------------------------------------------------------------------------
>
> Key: SAMZA-1392
> URL: https://issues.apache.org/jira/browse/SAMZA-1392
> Project: Samza
> Issue Type: Bug
> Reporter: Jake Maes
> Assignee: Jake Maes
> Fix For: 0.14.0
>
> Attachments: Producer Performance Tests for SAMZA-1392 - Sheet1.pdf
>
>
> There are 2 issues we need to fix in the KafkaSystemProducer when sends and
> flushes are called concurrently:
> 1. Concurrent sends contend for the sendlock, especially when producer
> compression is enabled. The fix is to use the producer.flush() API, which
> Kafka has supported since at least version 0.9.x. This way we won't need to
> track the latest future, so we won't need the lock.
> 2. When task.async.commit is enabled, the threads calling send() could set
> the exceptionInCallback to null before the exception is handled in user code
> or flush(). This could allow us to checkpoint offsets for which the
> corresponding output was not successfully sent.
> The short-term solution here is to handle the callback exceptions only in
> flush() and to let users configure those exceptions as ignorable in case they
> don't want flush() to fail.
> The long-term solution is to support a fully asynchronous SystemProducer
> (ticket SAMZA-1393).
> I found issue #2 while working on issue #1, so while they're separate issues,
> it's easier to fix them with one ticket/patch.
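
The short-term approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual KafkaSystemProducer code or the contents of the patch: AsyncClient, ToyClient, FlushCheckedProducer and the ignoreSendErrors flag are hypothetical names invented for this example, and AsyncClient is only a stand-in for a client with a blocking flush(), not the Kafka producer API. The sketch assumes send() records a callback failure without ever clearing it, and that flush() is the only place where the failure is read, reset and (unless configured as ignorable) rethrown.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical stand-in for an async client; NOT the Kafka producer API. */
interface AsyncClient {
    /** Sends a message; the callback receives null on success or the failure. */
    void send(String message, Consumer<Exception> callback);

    /** Blocks until every previously submitted send has completed. */
    void flush();
}

/** Toy client for the sketch: completes sends inline and fails the message "bad". */
final class ToyClient implements AsyncClient {
    @Override
    public void send(String message, Consumer<Exception> callback) {
        callback.accept("bad".equals(message)
                ? new RuntimeException("send failed: " + message)
                : null);
    }

    @Override
    public void flush() {
        // Nothing is ever pending in this toy client.
    }
}

/** Hypothetical producer wrapper illustrating the short-term fix. */
final class FlushCheckedProducer {
    private final AsyncClient client;
    private final boolean ignoreSendErrors; // assumed config switch, name invented here
    private final AtomicReference<Exception> exceptionInCallback = new AtomicReference<>();

    FlushCheckedProducer(AsyncClient client, boolean ignoreSendErrors) {
        this.client = client;
        this.ignoreSendErrors = ignoreSendErrors;
    }

    /** Takes no lock and never resets exceptionInCallback. */
    void send(String message) {
        client.send(message, error -> {
            if (error != null) {
                // Keep the first failure until flush() observes it.
                exceptionInCallback.compareAndSet(null, error);
            }
        });
    }

    /** The only place where a callback failure is consumed. */
    void flush() {
        client.flush(); // relies on the client's own flush instead of tracking futures
        Exception error = exceptionInCallback.getAndSet(null);
        if (error != null && !ignoreSendErrors) {
            throw new IllegalStateException("Send failed; not safe to checkpoint", error);
        }
    }
}
```

With this shape, a later successful send() cannot hide an earlier failure, so a commit that calls flush() first would see the error before any offsets are checkpointed. Constructing the wrapper with ignoreSendErrors set to true models the configurable "ignorable" behaviour mentioned in the description.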