How about writing to a buffer? Then you would flush the buffer to Kafka if and only if the output operation reports successful completion. In the event of a worker failure, the flush would never run, so nothing would be published for that batch.
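A minimal sketch of that idea in plain Scala (the `BufferedWriter` class and `send` callback are hypothetical names, not part of any Spark or Kafka API): records accumulate locally, and only an explicit `flush()` after a successful batch hands them to the producer.

```scala
import scala.collection.mutable.ArrayBuffer

// Sketch: buffer output records locally; publish only on explicit flush.
// `send` stands in for whatever actually writes a batch to Kafka.
class BufferedWriter(send: Seq[String] => Unit) {
  private val buffer = ArrayBuffer.empty[String]

  // Called per record during the output operation; nothing leaves the buffer yet.
  def output(rec: String): Unit = buffer += rec

  // Called only after the output operation reports success. If the worker
  // fails before this point, the buffered records are simply dropped.
  def flush(): Unit = {
    send(buffer.toList)
    buffer.clear()
  }
}
```

Note this only prevents publishing partial output from a failed attempt; if the batch is retried after a successful flush, duplicates are still possible, so it is at-most-once per attempt rather than true exactly-once.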
— FG

On Sun, Nov 30, 2014 at 2:28 PM, Josh J <joshjd...@gmail.com> wrote:

> Is there a way to do this that preserves exactly-once semantics for the
> write to Kafka?
>
> On Tue, Sep 2, 2014 at 12:30 PM, Tim Smith <secs...@gmail.com> wrote:
>
>> I'd be interested in finding the answer too. Right now, I do:
>>
>> val kafkaOutMsgs = kafkaInMessages.map(x => myFunc(x._2, someParam))
>> kafkaOutMsgs.foreachRDD((rdd, time) => {
>>   rdd.foreach(rec => { writer.output(rec) })
>> })
>> // where writer.output is a method that takes a string and writer is an
>> // instance of a producer class.
>>
>> On Tue, Sep 2, 2014 at 10:12 AM, Massimiliano Tomassi
>> <max.toma...@gmail.com> wrote:
>>
>>> Hello all,
>>> After having applied several transformations to a DStream, I'd like to
>>> publish all the elements in all the resulting RDDs to Kafka. What would
>>> be the best way to do that? Just using DStream.foreach and then
>>> RDD.foreach? Is there any other built-in utility for this use case?
>>>
>>> Thanks a lot,
>>> Max
>>>
>>> --
>>> Massimiliano Tomassi
>>> e-mail: max.toma...@gmail.com
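The per-record `rdd.foreach(writer.output)` pattern quoted above can be reshaped so that one producer is created per partition on the executor, rather than capturing a (typically non-serializable) writer from the driver. A sketch, assuming the Kafka Java client is on the classpath; `createProducer` and the topic name `out-topic` are hypothetical placeholders:

```scala
// Sketch only: assumes a Spark Streaming job where `kafkaOutMsgs` is a
// DStream[String], and a hypothetical factory `createProducer()` that
// builds a configured KafkaProducer on the executor.
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

kafkaOutMsgs.foreachRDD { rdd =>
  rdd.foreachPartition { partition =>
    // One producer per partition, constructed on the executor, so no
    // producer instance needs to be serialized from the driver.
    val producer: KafkaProducer[String, String] = createProducer()
    partition.foreach { rec =>
      producer.send(new ProducerRecord("out-topic", rec))
    }
    producer.close() // flushes any buffered records before returning
  }
}
```

Delivery with this pattern is still at-least-once: if a task is retried after some sends succeeded, those records are written again. Getting closer to exactly-once requires idempotent downstream writes or a transactional/idempotent producer.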