Hi Olivier,

> Are you talking about the $checkpoint spout or MySpout (with the offset)?

I was referring to the user spout (MySpout in this case).

> Does it mean all the emitted tuples are acked only when the $checkpoint.txId
> event is acked (and so $checkpoint.txId acts as a barrier)? Which means when
> tuples are acked (in MySpout), I am sure a state has been checkpointed.

Yes, that is right. So when tuples are acked in MySpout you can move your offsets.

> Does it mean my checkpoint interval must be lower than the tuple timeout
> (TOPOLOGY_MESSAGE_TIMEOUT_SECS)?

Right, if you change the defaults, the checkpoint interval should be lower than the message timeout. The default checkpoint interval is 1s and the default message timeout is 30s.

Thanks,
Arun

From: Olivier Mallassi
Reply-To: "[email protected]"
Date: Wednesday, May 18, 2016 at 12:57 AM
To: "[email protected]"
Subject: Re: State Checkpointing & spout state

Hi Arun,

Thank you for your answer. I may be able to deal with "at least once" using idempotency and a stateful bolt (I still need to look at it in detail), but being able to checkpoint the state of the spout would be really helpful ;) anyway.

I may have missed something in the doc, but I just need to clarify your phrase "It checkpoints the states of all the bolts and once that’s successful, the tuples emitted by the spout are acked".

Are you talking about the $checkpoint spout or MySpout (with the offset)?

Does it mean all the emitted tuples are acked only when the $checkpoint.txId event is acked (and so $checkpoint.txId acts as a barrier)? Which means when tuples are acked (in MySpout), I am sure a state has been checkpointed.

Does it mean my checkpoint interval must be lower than the tuple timeout (TOPOLOGY_MESSAGE_TIMEOUT_SECS)?

Many thanks for your help.

Olivier.

On Tue, May 17, 2016 at 2:12 PM, Arun Mahadevan <[email protected]> wrote:

Hi Olivier,

The state checkpointing currently does not checkpoint the state of the spout. It checkpoints the states of all the bolts and once that’s successful, the tuples emitted by the spout are acked. So currently it provides an at-least-once guarantee.
In the ack method of the spout, you can update your offsets. In the future we will extend state checkpointing to checkpoint the state of the spout.

Thanks,
Arun

From: Olivier Mallassi
Reply-To: "[email protected]"
Date: Tuesday, May 17, 2016 at 5:29 PM
To: "[email protected]"
Subject: State Checkpointing & spout state

Hello,

I would need to use state checkpointing for recovery (btw, a very useful feature). I am facing an issue regarding how to checkpoint the state of my spout (not the checkpoint spout) as part of the "transaction".

My spout is reading from Kafka (or equivalent) and so keeps an offset of the last read events. It keeps track of:
- the last read offset
- the emitted and acknowledged events (with their associated offsets)
- the emitted and unacked events (so they can be replayed)

With state checkpointing, the bolt states will be kept, but how can I keep the state of the source? How can I ensure the spout replays events from the offset that matches the checkpoint (or txid)?

Are there any guarantees in Storm that the acks are received in the order they are sent?

Cheers,
Olivier
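
For illustration only, the spout-side bookkeeping discussed in this thread (last read offset, emitted-and-acked events, emitted-and-unacked events) could be sketched roughly as below. This is plain Java with no Storm dependency. `OffsetTracker` and its methods are hypothetical names, not part of the Storm API. The sketch deliberately does not assume that acks arrive in emit order: the offset it reports as safe to restart from is the oldest offset still waiting for an ack.

```java
import java.util.TreeSet;

/**
 * Hypothetical helper (not a Storm class): tracks which offsets a spout has
 * emitted but not yet seen acked, and derives the offset to resume from.
 */
final class OffsetTracker {
    // Offsets emitted but not yet acked, kept sorted so the oldest is cheap to find.
    private final TreeSet<Long> pending = new TreeSet<>();
    // One past the highest offset emitted so far.
    private long nextToEmit;

    OffsetTracker(long startOffset) {
        this.nextToEmit = startOffset;
    }

    /** Record that the tuple for this offset was emitted. */
    void emitted(long offset) {
        pending.add(offset);
        nextToEmit = Math.max(nextToEmit, offset + 1);
    }

    /** Record an ack; acks may arrive in any order. */
    void acked(long offset) {
        pending.remove(offset);
    }

    /**
     * Offset that is safe to persist and restart from: everything below it
     * has been acked. Replaying from here gives at-least-once delivery.
     */
    long committableOffset() {
        return pending.isEmpty() ? nextToEmit : pending.first();
    }

    public static void main(String[] args) {
        OffsetTracker tracker = new OffsetTracker(0L);
        tracker.emitted(0L);
        tracker.emitted(1L);
        tracker.emitted(2L);
        tracker.acked(1L); // out-of-order ack: offset 0 is still pending
        System.out.println("after ack(1): " + tracker.committableOffset());
        tracker.acked(0L);
        System.out.println("after ack(0): " + tracker.committableOffset());
    }
}
```

In a real spout, `emitted` would presumably be called from `nextTuple` (using the offset as the message id) and `acked` from `ack(Object msgId)`; a failed tuple simply stays pending until it is re-emitted and acked. Because the committable offset can trail offsets that were already acked, a restart may replay some of them, which matches the at-least-once guarantee described above.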
