Jake Maes created SAMZA-1392:
--------------------------------

             Summary: KafkaSystemProducer performance and correctness with concurrent sends and flushes
                 Key: SAMZA-1392
                 URL: https://issues.apache.org/jira/browse/SAMZA-1392
             Project: Samza
          Issue Type: Bug
            Reporter: Jake Maes
            Assignee: Jake Maes
             Fix For: 0.14.0


There are two issues we need to fix in the KafkaSystemProducer when sends and 
flushes are called concurrently:
1. Concurrent sends contend for the sendlock, especially when producer 
compression is enabled. The fix is to have flush() call the producer.flush() 
API, which Kafka has supported since at least version 0.9.x. This way we won't 
need to track the latest future, so we won't need the lock.

2. When task.async.commit is enabled, the threads calling send() could set the 
exceptionInCallback to null before the exception is handled in user code or 
flush(). This could allow us to checkpoint offsets for which the corresponding 
output was not successfully sent.
The short-term solution here is to handle the callback exceptions only from 
flush(), and to allow users to configure the exceptions as ignorable in case 
they don't want flush to fail.
The long-term solution is to support a fully asynchronous SystemProducer. 
A ticket for this is coming soon.
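
The short-term approach for issue 2 could be sketched roughly as below. This 
is only an illustration, not the actual patch: the class CallbackErrorTracker, 
the method onSendFailure() and the ignoreCallbackErrors switch are hypothetical 
names, and IllegalStateException stands in for whatever exception type the real 
code throws. The point is that send() never reads or clears the recorded 
exception; only flush() does.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the producer's error bookkeeping (not Samza code).
class CallbackErrorTracker {
    // First failure reported by a send callback since the last flush().
    private final AtomicReference<Exception> exceptionInCallback =
        new AtomicReference<>();
    // Hypothetical config switch: treat callback failures as ignorable.
    private final boolean ignoreCallbackErrors;

    CallbackErrorTracker(boolean ignoreCallbackErrors) {
        this.ignoreCallbackErrors = ignoreCallbackErrors;
    }

    // Invoked from the producer's callback thread when a send fails.
    void onSendFailure(Exception e) {
        // Keep the first failure; later failures do not overwrite it.
        exceptionInCallback.compareAndSet(null, e);
    }

    // send() path: deliberately does not touch exceptionInCallback, so a
    // concurrent sender cannot reset it before flush() has seen it.
    void send(String message) {
        // Hand-off to the underlying producer would go here.
    }

    // flush() path: the only place where the recorded exception is consumed.
    void flush() {
        // The underlying producer.flush() call would go here.
        Exception e = exceptionInCallback.getAndSet(null);
        if (e != null && !ignoreCallbackErrors) {
            throw new IllegalStateException(
                "Unable to flush: a previous send failed", e);
        }
    }
}
```

With this shape, a commit that calls flush() before checkpointing either sees 
the failure or has been explicitly configured to ignore it.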

I found issue #2 while working on issue #1, so while they're separate issues, 
it's easier to fix them with one ticket/patch.
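
For issue 1, the lock-free shape could look something like the sketch below. 
To keep it self-contained it does not use the Kafka client: FlushableProducer, 
InMemoryProducer, LockFreeSender and ConcurrentSendDemo are all hypothetical 
names, and the in-memory producer only imitates a client whose flush() blocks 
until earlier sends are acknowledged. The wrapper holds no lock and tracks no 
future; flush() simply delegates.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical minimal view of a producer client with send() and flush().
interface FlushableProducer {
    void send(String message);
    void flush(); // blocks until every previously sent message is acknowledged
}

// In-memory stand-in: one I/O thread acknowledges messages in FIFO order.
class InMemoryProducer implements FlushableProducer {
    private final ExecutorService io = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true); // do not keep the JVM alive
        return t;
    });
    final AtomicInteger acked = new AtomicInteger();

    @Override
    public void send(String message) {
        io.execute(acked::incrementAndGet);
    }

    @Override
    public void flush() {
        try {
            // A marker task queued behind all earlier sends; once it has run,
            // every send submitted before this call has been acknowledged.
            io.submit(() -> { }).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while flushing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Flush failed", e);
        }
    }

    void close() {
        io.shutdown();
    }
}

// The wrapper: no shared lock and no "latest future" bookkeeping.
class LockFreeSender {
    private final FlushableProducer producer;

    LockFreeSender(FlushableProducer producer) {
        this.producer = producer;
    }

    void send(String message) {
        producer.send(message);
    }

    void flush() {
        producer.flush();
    }
}

class ConcurrentSendDemo {
    // Sends from several threads at once, flushes, and returns the ack count.
    static int sendAndFlush(int threads, int perThread) {
        InMemoryProducer producer = new InMemoryProducer();
        LockFreeSender sender = new LockFreeSender(producer);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    sender.send("msg");
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while joining", e);
            }
        }
        sender.flush();
        int count = producer.acked.get();
        producer.close();
        return count;
    }
}
```

Calling ConcurrentSendDemo.sendAndFlush(4, 1000) sends from four threads 
without any lock in the wrapper and then flushes once.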



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
