Hello,

I’m trying to understand what kind of ordering guarantee is expected from the Kafka spout in case of failure. I’m using Storm 0.9.x, configuring the spout as described at http://storm.apache.org/releases/0.9.6/storm-kafka.html, with ZkHosts and changing only startOffsetTime to LatestTime. The rest of the configuration is left at its defaults. I’m depending on:

  <groupId>org.apache.storm</groupId>
  <artifactId>storm-kafka</artifactId>
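For reference, my spout setup looks roughly like this (the topic name, ZooKeeper connect string, and client id below are placeholders, not my actual values):

```java
import backtype.storm.spout.SchemeAsMultiScheme;
import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;

// Placeholder ZooKeeper connect string.
BrokerHosts hosts = new ZkHosts("zkhost:2181");

// Placeholder topic, ZK root path, and consumer id.
SpoutConfig spoutConfig = new SpoutConfig(hosts, "my-topic", "/kafka-spout", "my-spout-id");
spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

// The only non-default setting: start reading from the latest offset.
spoutConfig.startOffsetTime = kafka.api.OffsetRequest.LatestTime();

KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);
```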
Searching the mailing list archives, I found this in a previous message, http://mail-archives.apache.org/mod_mbox/storm-user/201501.mbox/%3CCAOv%2BhsQNYp1BOACHnvzSS%2BSwS%2BKXvb8Q-7FiTWCiWqFEU0-h%2Bw%40mail.gmail.com%3E:

  "Suppose for example your spout emits tuples A, B, C, D, E and tuple C fails. […] KafkaSpout, on the other hand, would also re-emit all tuples after the failed tuple. So it would re-emit C, D, and E, even if D and E were successfully processed."

However, I haven’t been able to reproduce this behavior in my tests. After a failure, only the failed tuple is re-emitted: in the above example, only C is re-emitted, not D and E. All tuples are in the same Kafka partition.

Am I missing some configuration to enable this behavior, or is there a different implementation of the Kafka spout that supports it?

Thanks,
Giancarlo
