Hello,

I’m trying to understand what kind of ordering guarantee is expected from the 
Kafka spout in case of failure.
I’m using Storm 0.9.x, configuring the spout as described here 
http://storm.apache.org/releases/0.9.6/storm-kafka.html, with ZkHosts and only 
changing startOffsetTime to be LatestTime. The rest of the config is not 
modified and kept as default.
I’m depending on:
<groupId>org.apache.storm</groupId>
<artifactId>storm-kafka</artifactId>
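For reference, my spout setup looks roughly like the following sketch (the ZooKeeper address, topic name, and consumer id are placeholders, not my actual values):

```java
// Sketch of the spout configuration described above (storm-kafka 0.9.x API).
// Only startOffsetTime is changed from the defaults.
BrokerHosts hosts = new ZkHosts("zkhost:2181");                  // placeholder ZK address
SpoutConfig spoutConfig = new SpoutConfig(
        hosts,
        "my-topic",      // placeholder topic
        "/my-topic",     // ZK root for offset storage
        "my-consumer");  // placeholder consumer id
spoutConfig.startOffsetTime = kafka.api.OffsetRequest.LatestTime();
KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);
```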

Searching the mailing list I’ve found this in a previous message, 
http://mail-archives.apache.org/mod_mbox/storm-user/201501.mbox/%3CCAOv%2BhsQNYp1BOACHnvzSS%2BSwS%2BKXvb8Q-7FiTWCiWqFEU0-h%2Bw%40mail.gmail.com%3E:

Suppose for example your spout emits tuples A, B, C, D, E and tuple C fails.[…] 
KafkaSpout, on the other hand, would also re-emit all tuples after the failed 
tuple. So it would re-emit C, D, and E, even if D and E were successfully 
processed.

However, I haven’t been able to reproduce this behavior in my tests. After a 
failure, only the failed tuple is re-emitted: in the above example, only C is 
re-emitted, not D and E. All the tuples are in the same Kafka partition.
Am I missing some configuration needed to enable this behavior, or is there a 
different implementation of the Kafka spout that supports it?

Thanks,
Giancarlo
