Re: Samza commit guarantees

2015-02-20 Thread Chris Riccomini
Hey Richard, > It’s sort of ‘by convention’ that they are documented in samza as matching exactly what kafka expects, no? Samza's KafkaSystem has three prefixes: systems..samza.* systems..consumer.* systems..producer.* The producer.* configs are passed directly to the underlying Kafka producer

Re: Samza commit guarantees

2015-02-19 Thread Richard Lee
Ah. Yes, I read the KafkaSystemProducer code, but did not dive deeply enough down into the actual Kafka library implementation itself to see where these parameters are used. It’s sort of ‘by convention’ that they are documented in samza as matching exactly what kafka expects, no? Richard > O

Re: Samza commit guarantees

2015-02-19 Thread Navina Ramesh
Hi Richard, SystemProducer is an abstraction for producers. If you are specifically looking for Kafka as a system, you should be looking into KafkaSystemProducer. Depending on the Samza version you are using, you should be able to see the code related to buffering. If you check the RunLoop class,

Re: Samza commit guarantees

2015-02-19 Thread Richard Lee
> On Feb 19, 2015, at 3:46 PM, Richard Lee wrote: > > I’m looking at the samza-core source code… in particular RunLoop, > TaskInstance, TaskInstanceCollector, and SystemProducers, and I’m having a > hard time seeing where this batched sends of output messages happens. It > seems from RunLoop

Re: Samza commit guarantees

2015-02-19 Thread Richard Lee
I’m looking at the samza-core source code… in particular RunLoop, TaskInstance, TaskInstanceCollector, and SystemProducers, and I’m having a hard time seeing where this batched sends of output messages happens. It seems from RunLoop that: - one input message envelope is read from the consumer

Re: Samza commit guarantees

2015-02-19 Thread Chris Riccomini
Hey Tom, It seems that most of your questions are concerned with durability and messaging guarantees. Samza is designed to not lose data, but duplicates can occur. Samza reads messages, and feeds them to your process() method. When you send messages, either via a changelog, or via collector.send,