All, I followed the back-port of storm-kafka-client from 1.1.x to run on Storm 1.0.x, using the Maven override of the Timer class from the newer Storm release.
It looks like the Metron KafkaSpoutConfig.Builder (the config build-up is a little difficult to follow, so I might be wrong) results in a NamedSubscription, which uses the consumer.subscribe API. Wanted to let you know that this API isn't a good fit for Storm, since partition reassignment is left to Kafka rather than being managed by Storm. What we are seeing is roughly 1%-2% duplicates across two spout instances just during initial topology startup.

Consider this situation: two Kafka partitions and two spout tasks. Task id 1 starts first, polls Kafka, and is assigned all partitions. Milliseconds to a couple of seconds later, task id 2 starts up, which forces a Kafka partition rebalance across both tasks. Task id 1 will already have played tuples through the topology but can no longer commit offsets for them, so task id 2 will potentially replay the same tuples after the rebalance, with the situation worsening as the number of spout tasks increases. If the topology uses multiple tasks per executor (i.e. 1:1 task/executor is not being used), there is also the potential for a blocking action that causes the other consumers in the same executor to hit the poll timeout, and it rolls on from there during the post-startup partition reassignments.

There is a ManualPartitionNamedSubscription with a RoundRobinManualPartitioner, but I have yet to get it to work. This JIRA captures the problem: https://issues.apache.org/jira/browse/STORM-2542, and there are other, competing PRs/JIRAs now as well. Based on a current Storm dev discussion, they are looking to address this in both master and 1.x by converting to the 'assign' API, which directs exact partitions to consumers, and to default to that Kafka API going forward. There is some discussion there about Flux compatibility with the builder methods, and I thought you all could use visibility on that in particular.

Kris
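For reference, the round-robin scheme that a RoundRobinManualPartitioner is meant to apply can be sketched in plain Java: each task owns every Nth partition based on its task index, independent of startup order, so there is nothing for Kafka to rebalance when a second task comes up late. The class and method names below are my own illustration of the idea, not the Storm implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class RoundRobinAssignment {

    // Returns the partitions this task should own: every partition p
    // with p % totalTasks == taskIndex.
    static List<Integer> partitionsFor(int taskIndex, int totalTasks, int numPartitions) {
        List<Integer> owned = new ArrayList<>();
        for (int p = taskIndex; p < numPartitions; p += totalTasks) {
            owned.add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        // Two spout tasks, two partitions: each task owns exactly one
        // partition whether it starts first or second, so the startup
        // rebalance (and the duplicate replay) described above cannot happen.
        System.out.println(partitionsFor(0, 2, 2)); // task 0 -> [0]
        System.out.println(partitionsFor(1, 2, 2)); // task 1 -> [1]
    }
}
```

Because the mapping is a pure function of (taskIndex, totalTasks, numPartitions), every task computes the same disjoint assignment without coordinating through Kafka's group protocol.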
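To make the subscribe-vs-assign distinction concrete, here is a minimal sketch against the plain kafka-clients consumer API (topic name and group id are made up for illustration; this is not the Storm spout code, just the two underlying Kafka calls being discussed):

```java
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SubscribeVsAssign {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group"); // hypothetical group id
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        // subscribe(): Kafka's group coordinator owns partition placement,
        // so a late-starting task triggers a group rebalance -- the source
        // of the duplicate replay during topology startup.
        // consumer.subscribe(Arrays.asList("example-topic"));

        // assign(): the caller pins exact partitions, so no rebalance occurs
        // when other consumers come up. This is the API the STORM-2542
        // discussion is converting to.
        consumer.assign(Arrays.asList(new TopicPartition("example-topic", 0)));

        consumer.close();
    }
}
```

With assign(), offset commits are still written under the group.id, but partition ownership never moves between consumers unless the caller moves it, which is exactly the property Storm wants when it already knows the task-to-partition mapping.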