Hi Arun,

Thank you for your answer.
I may be able to deal with "at least once" using idempotency and a stateful
bolt (I still need to look at it in detail), but being able to checkpoint the
state of the spout would be really helpful ;)
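
To make the idempotency idea concrete, here is a minimal sketch in plain
Java. It is not the Storm API: the class IdempotentCounter and its methods
are hypothetical names standing in for the state a stateful bolt would
hold. It assumes each event carries a unique offset, so a tuple replayed
after a failure can be recognised and skipped.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the state held by a stateful bolt (not a Storm class).
class IdempotentCounter {
    // Offsets already applied; assumed to be checkpointed together with 'total'.
    private final Set<Long> seenOffsets = new HashSet<>();
    private long total = 0;

    // Applies the event at most once. Returns false for a replayed offset.
    boolean apply(long offset, long value) {
        if (!seenOffsets.add(offset)) {
            return false; // duplicate delivery, e.g. after a replay
        }
        total += value;
        return true;
    }

    long total() {
        return total;
    }

    public static void main(String[] args) {
        IdempotentCounter counter = new IdempotentCounter();
        counter.apply(1, 10);
        counter.apply(2, 5);
        counter.apply(1, 10); // replayed tuple, ignored
        System.out.println(counter.total());
    }
}
```

In a real topology the set of seen offsets would have to be bounded (for
example by pruning everything below the last committed offset), which this
sketch leaves out.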

Anyway, I may have missed something in the docs, but I just need to clarify
your sentence: "It checkpoints the states of all the bolts and once that’s
successful, the tuples emitted by the spout are acked".

Are you talking about the $checkpoint spout or MySpout (with the offset)?
Does it mean all the emitted tuples are acked only when the
$checkpoint.txId event is acked (so that $checkpoint.txId acts as a barrier)?
That would mean that when tuples are acked (in MySpout), I can be sure a
state has been checkpointed.
Does it also mean my checkpoint interval must be lower than the tuple timeout
(TOPOLOGY_MESSAGE_TIMEOUT_SECS)?
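
For illustration, this is roughly the offset bookkeeping I have in mind
for MySpout if offsets are updated in ack(). It is a minimal sketch in
plain Java, not the Storm API: OffsetTracker, emitNext, ack and
committableOffset are hypothetical names. It assumes acks may arrive in
any order, so the offset that is safe to restart from is the lowest
offset that has been emitted but not yet acked.

```java
import java.util.TreeSet;

// Hypothetical helper a spout could call from nextTuple() and ack().
class OffsetTracker {
    // Offsets emitted but not yet acked, kept sorted.
    private final TreeSet<Long> pending = new TreeSet<>();
    // Next offset to read from the source.
    private long nextToEmit;

    OffsetTracker(long startOffset) {
        this.nextToEmit = startOffset;
    }

    // Records that the next offset has been emitted and returns it.
    long emitNext() {
        long offset = nextToEmit++;
        pending.add(offset);
        return offset;
    }

    // Records an ack; acks may arrive out of order.
    void ack(long offset) {
        pending.remove(offset);
    }

    // Offset to restart from: the lowest unacked offset,
    // or the next unread offset when nothing is pending.
    long committableOffset() {
        return pending.isEmpty() ? nextToEmit : pending.first();
    }

    public static void main(String[] args) {
        OffsetTracker tracker = new OffsetTracker(0);
        tracker.emitNext();
        tracker.emitNext();
        tracker.emitNext();
        tracker.ack(2); // out-of-order ack does not advance the restart point
        System.out.println(tracker.committableOffset());
    }
}
```

With this bookkeeping the restart offset never skips an unacked event, at
the price of replaying events that were acked after it, which is the
at-least-once behaviour discussed above.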

Many thanks for your help.

Olivier.

On Tue, May 17, 2016 at 2:12 PM, Arun Mahadevan <[email protected]> wrote:

> Hi Olivier,
>
> The state checkpointing currently does not checkpoint the state of the
> spout. It checkpoints the states of all the bolts and once that’s
> successful, the tuples emitted by the spout are acked. So currently it
> provides an at-least-once guarantee.
>
> In the ack method of the spout, you can update your offsets.
>
> In the future we will extend state checkpointing to checkpoint the state
> of the spout.
>
> Thanks,
> Arun
>
>
> From: Olivier Mallassi
> Reply-To: "[email protected]"
> Date: Tuesday, May 17, 2016 at 5:29 PM
> To: "[email protected]"
> Subject: State Checkpointing & spout state
>
> Hello
>
> I need to use state checkpointing for recovery (btw, very useful
> feature). I am facing an issue regarding how to checkpoint the state of
> my spout (not the checkpoint spout) as part of the "transaction".
>
> My spout reads from Kafka (or equivalent) and so keeps the offset of
> the last read events.
> It keeps track of
> - the last read offset
> - the emitted and acknowledged events (with their associated offsets)
> - the emitted and unacked events (so they can be replayed)
>
> With state checkpointing, the bolt states will be kept, but how can I keep
> the state of the source? How can I ensure the spout replays events from
> the offset that matches the checkpoint (or txid)?
> Are there any guarantees in Storm that the acks are received in the order
> they are sent?
>
> Cheers.
>
> olivier.
>
