Jake Maes created SAMZA-1393:
--------------------------------
Summary: Asynchronous SystemProducer API
Key: SAMZA-1393
URL: https://issues.apache.org/jira/browse/SAMZA-1393
Project: Samza
Issue Type: Bug
Reporter: Jake Maes
We've encountered a number of issues with the current SystemProducer interface
with async processors. For example, see SAMZA-1392.
Most of the issues arise from the fact that there's no way for us to notify the
user when there is an async error in from the producer. Consider the
KafkaSystemProducer. If the KafkaProducer callback returns an exception, the
KafkaSystemProducer has to track it and find an appropriate time to handle the
exception. Depending on where the exception is handled and whether it is
swallowed the OffsetManager will need to be overly conservative to ensure we do
not checkpoint offsets until the corresponding send() was successful. Sometimes
whole batches of messages will fail, but only a single exception is thrown, and
there's no context from which send() the exception originated. This will always
be unintuitive to the user unless they are synchronously notified of the
exception.
There hasn't been much planning on this yet, but the idea is to have a send()
method that takes a callback which is called directly from the producer
callback. So the callback would flow something like this:
producer.callback -> user.callback -> asynctask.callback -> (throw error |
offsetmanager.update)
Some care needs to be taken to ensure that there are no gaps in the offsets
recorded by the OffsetManager. Also, it's not yet clear how to enable users to
retry a failed send.
NOTE: depending on the implementation, the changes in SAMZA-1384 may need to be
reverted after this feature is implemented to ensure that the latest offsets
are checkpointed.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)