Re: Kafka producer sink message loss?

2016-06-07 Thread Elias Levy
On Tue, Jun 7, 2016 at 4:52 AM, Stephan Ewen wrote: > The concern you raised about the sink being synchronous is exactly what my > last suggestion should address: > > The internal state backends can return a handle that can do the sync in a > background thread. The sink would

Re: Kafka producer sink message loss?

2016-06-07 Thread Stephan Ewen
Hi Elias! The concern you raised about the sink being synchronous is exactly what my last suggestion should address: The internal state backends can return a handle that can do the sync in a background thread. The sink would continue processing messages, and the checkpoint would only be

Re: Kafka producer sink message loss?

2016-06-06 Thread Elias Levy
On Sun, Jun 5, 2016 at 3:16 PM, Stephan Ewen wrote: > You raised a good point. Fortunately, there should be a simply way to fix > this. > > The Kafka Sunk Function should implement the "Checkpointed" interface. It > will get a call to the "snapshotState()" method whenever a

Re: Kafka producer sink message loss?

2016-06-05 Thread Stephan Ewen
You raised a good point. Fortunately, there should be a simply way to fix this. The Kafka Sunk Function should implement the "Checkpointed" interface. It will get a call to the "snapshotState()" method whenever a checkpoint happens. Inside that call, it should then sync on the callbacks, and only

Kafka producer sink message loss?

2016-06-03 Thread Elias Levy
I am correct in assuming that the Kafka producer sink can lose message? I don't expect exactly-once semantics using Kafka as a sink given Kafka publishing guarantees, but I do expect at least once. I gather from reading the source that the producer is publishing messages asynchronously, as