Hello all,

We're investigating using Storm for some work. We have a topology using the KafkaSpout followed by two bolts: a processing (enrichment) step, followed by a Kafka producer. Each bolt extends BaseRichBolt and anchors and acknowledges every tuple. We run the topology with 1 worker, 10 executors and 10 tasks: 3 each for the reader, enricher and producer, and 1 for the __acker. The KafkaSpout reads from a Kafka topic with 3 partitions.

We are able to run the topology for about 6 hrs, processing about 25MM tuples/hr, and then it just stops. There are no tuple failures and no error messages; the number of emitted and acked tuples simply drops to zero.
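To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch built from the approximate numbers quoted in this post ("about 6 hrs", "about 25MM tuples/hr"), so the results are ballpark values. The variable names are made up for illustration. The in-flight cap assumes the max.spout.pending limit applies per spout task, which is how Storm documents that setting.

```python
# Approximate figures taken from the post; names are illustrative only.
TUPLES_PER_HOUR = 25_000_000      # "about 25MM tuples/hr"
HOURS_BEFORE_STALL = 6            # "about 6 hrs"
MAX_SPOUT_PENDING = 1000          # max.spout.pending as configured
SPOUT_TASKS = 3                   # one reader task per Kafka partition
EXECUTORS = {"reader": 3, "enricher": 3, "producer": 3, "__acker": 1}

# Average throughput across the whole topology.
tuples_per_second = TUPLES_PER_HOUR / 3600

# Rough number of tuples processed before emits and acks drop to zero.
total_before_stall = TUPLES_PER_HOUR * HOURS_BEFORE_STALL

# Sanity check on the stated parallelism (should match the 10 executors).
total_executors = sum(EXECUTORS.values())

# The pending limit is per spout task, so this is the topology-wide
# ceiling on un-acked tuples in flight at any moment.
max_in_flight = MAX_SPOUT_PENDING * SPOUT_TASKS

print(f"throughput:   ~{tuples_per_second:,.0f} tuples/s")
print(f"before stall: ~{total_before_stall:,} tuples")
print(f"executors:    {total_executors}")
print(f"in-flight cap: {max_in_flight} tuples")
```

So the stall shows up after roughly 150 million tuples at an average of about 7,000 tuples/s, with at most 3,000 tuples un-acked at any one time.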
Bolt capacity is under 0.5 on every bolt (usually well below). Execute latency on the bolts is < 1 ms. We have set max.spout.pending to 1000. We are running Kafka 0.10.0.0 and Storm 1.0.1 on an Ubuntu Trusty VM. The firewall is disabled.

Does anyone have any suggestions as to why this might be happening? Also, can someone suggest ways we could go about troubleshooting this behavior?

Thanks!
-William
