[
https://issues.apache.org/jira/browse/SAMZA-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128071#comment-16128071
]
ASF GitHub Bot commented on SAMZA-1392:
---------------------------------------
GitHub user jmakes opened a pull request:
https://github.com/apache/samza/pull/272
SAMZA-1392: KafkaSystemProducer performance and correctness with concurrent
sends and flushes
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jmakes/samza samza-1392
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/samza/pull/272.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #272
----
commit a2f39392abc71fd1fe4221b2836d46fa78642502
Author: Jacob Maes
Date: 2017-08-15T23:36:16Z
SAMZA-1392: KafkaSystemProducer performance and correctness with concurrent
sends and flushes
----
> KafkaSystemProducer performance and correctness with concurrent sends and
> flushes
> ---------------------------------------------------------------------------------
>
> Key: SAMZA-1392
> URL: https://issues.apache.org/jira/browse/SAMZA-1392
> Project: Samza
> Issue Type: Bug
> Reporter: Jake Maes
> Assignee: Jake Maes
> Fix For: 0.14.0
>
>
> There are two issues we need to fix in the KafkaSystemProducer when sends and
> flushes are called concurrently:
> 1. Concurrent sends contend for the send lock, especially when producer
> compression is enabled. The fix is to use the producer.flush() API, which
> Kafka has supported since at least version 0.9.x. With flush() we no longer
> need to track the latest future, so we no longer need the lock.
> 2. When task.async.commit is enabled, the threads calling send() could set
> exceptionInCallback to null before the exception is handled in user code
> or in flush(). This could allow us to checkpoint offsets for which the
> corresponding output was not successfully sent.
> The short-term solution is to handle the callback exceptions only in
> flush(), and to let users configure the exceptions as ignorable in case they
> don't want flush() to fail.
> The long-term solution is to support a fully asynchronous SystemProducer,
> tracked in SAMZA-1393.
> I found issue 2 while working on issue 1, so while they're separate issues,
> it's easier to fix them with one ticket/patch.
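
For illustration only, the short-term approach described in the issue could be sketched roughly as below. This is not the code from the pull request. The class name CallbackErrorTracker, its method names, and the ignoreFailures flag are hypothetical; only exceptionInCallback and the idea of calling producer.flush() come from the issue text. A Runnable stands in for the real producer.flush() call so that the sketch has no Kafka dependency.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch, not the actual KafkaSystemProducer implementation.
final class CallbackErrorTracker {
    // First send failure that flush() has not yet handled, or null if none.
    private final AtomicReference<Exception> exceptionInCallback =
        new AtomicReference<>();

    // Called from a send callback when a send fails.
    // Only the first failure is kept; later ones do not overwrite it.
    void onSendFailure(Exception e) {
        exceptionInCallback.compareAndSet(null, e);
    }

    // Called by flush(). The send() path never clears the stored exception,
    // so a failure cannot be lost before a checkpoint is written.
    // producerFlush is a stand-in for producer.flush(), which is assumed to
    // block until all previously sent records have completed.
    // ignoreFailures is a hypothetical flag modelling the "ignorable" setting.
    void flush(Runnable producerFlush, boolean ignoreFailures) {
        producerFlush.run();
        // Read and clear in one atomic step so each failure is handled once.
        Exception e = exceptionInCallback.getAndSet(null);
        if (e != null && !ignoreFailures) {
            throw new IllegalStateException(
                "Unable to flush: a previous send failed", e);
        }
    }
}
```

Because the stored exception is read and cleared only inside flush(), no lock is needed around send(), and a concurrent send() cannot reset the failure before it is reported.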
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)