Ahh, sorry, I misunderstood the problem.
Does the application use InMemoryKeyValueStore to accumulate the deltas? If
so, I would think you could enable the changelog on that store and commit
as often as you like, because having the deltas backed up durably should
allow you to decouple commit()
Thanks Jake.
In this application we control the checkpointing explicitly to accumulate
certain amount of delta in memory before committing them to stores, and
checkpointing. This is to reduce the commit counts and some other business
case like deduplication of deltas.
The scenario that I was
Hey Gaurav,
Samza automatically keeps track of the offsets your job has successfully
processed for each SSP. When your task requests a checkpoint, Samza will
write the offset of the latest successfully-processed message for each SSP
that task consumes.
So if task0 consumes partition 0 of two
Thanks, I'll check it out.
I have a samza application that is consuming a lot of different types of
messages (these messages are related to each other but do not require join
- think of these like different configuration and metric information of
virtual machines that modify some central sates
In Samza, the logical unit of processing (and hence, checkpointing) is a
task. Hence, you cannot selectively checkpoint SSPs within a task.
However, you can configure how you group your SSPs into tasks by choosing a
Grouper. If you want to control checkpointing at the granularity of an SSP,
then
Hi All,
If I had Samza Tasks that were consuming message from multiple topics, how
would checkpoint/commit work in that case? On calling
taskCordinator.commit(), would current offset of all topics be saved for
the caller task (only the partitions assigned to the caller task)? Is
there a way to