Hey Richard,
> It’s sort of ‘by convention’ that they are documented in samza as
matching exactly what kafka expects, no?
Samza's KafkaSystem has three prefixes:
systems..samza.*
systems..consumer.*
systems..producer.*
The producer.* configs are passed directly to the underlying Kafka producer
Ah. Yes, I read the KafkaSystemProducer code, but did not dive deeply enough
down into the actual Kafka library implementation itself to see where these
parameters are used. It’s sort of ‘by convention’ that they are documented in
samza as matching exactly what kafka expects, no?
Richard
> O
Hi Richard,
SystemProducer is an abstraction for producers. If you are specifically
looking for Kafka as a system, you should be looking into
KafkaSystemProducer.
Depending on the Samza version you are using, you should be able to see
the code related to buffering.
If you check the RunLoop class,
> On Feb 19, 2015, at 3:46 PM, Richard Lee wrote:
>
> I’m looking at the samza-core source code… in particular RunLoop,
> TaskInstance, TaskInstanceCollector, and SystemProducers, and I’m having a
> hard time seeing where this batched sends of output messages happens. It
> seems from RunLoop
I’m looking at the samza-core source code… in particular RunLoop, TaskInstance,
TaskInstanceCollector, and SystemProducers, and I’m having a hard time seeing
where this batched sends of output messages happens. It seems from RunLoop
that:
- one input message envelope is read from the consumer
Hey Tom,
It seems that most of your questions are concerned with durability and
messaging guarantees. Samza is designed to not lose data, but duplicates
can occur. Samza reads messages, and feeds them to your process() method.
When you send messages, either via a changelog, or via collector.send,