Thanks for the reply. In either of these cases, shouldn't Storm stop letting the spout emit tuples once topology.max.spout.pending is reached? If so, the tuples already in the topology (or dropped by accident, collected in a bolt, etc.) will take 5 minutes to time out, and the number of tuples failing this way will be limited to topology.max.spout.pending per 5 minutes. The issue is that we are seeing a much higher rate of spout failures than that.
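
To make the arithmetic concrete, here is a rough sketch of the bound I have in mind. It assumes every pending tuple times out and is replaced right away, and that timeouts fire exactly at topology.message.timeout.secs. The `spout_tasks` parameter is a hypothetical addition that is not in our reported settings: as far as I understand, topology.max.spout.pending applies per spout task, so the bound would scale with the number of spout tasks.

```python
def max_timeouts(max_spout_pending, timeout_secs, elapsed_secs, spout_tasks=1):
    """Rough upper bound on tuple timeouts after elapsed_secs.

    Assumes each spout task holds at most max_spout_pending un-acked tuples
    and that a tuple needs a full timeout_secs before it can time out.
    spout_tasks is a hypothetical parameter, not one of the reported settings.
    """
    full_windows = elapsed_secs // timeout_secs
    return max_spout_pending * spout_tasks * full_windows


# Settings from the original message, after roughly 10 minutes of running.
bound = max_timeouts(max_spout_pending=2500, timeout_secs=300, elapsed_secs=600)
print(bound)

# Spout failures reported by the Storm UI over the same period.
observed = 50_000
print(observed / bound)
```

With a single spout task this gives a bound of 5,000, ten times lower than the roughly 50K failures we observe, so either the bound is multiplied by something (such as the spout task count) or the failures are not all timeouts.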

On Wed, Jul 27, 2016 at 3:48 PM, Igor Kuzmenko <[email protected]> wrote:

> We have such failures for two reasons:
>
> 1) The bolt doesn't ack each tuple immediately, but collects a batch and at
> some point acks them all. In that case there can be a situation where the
> batch is bigger than max_spout_pending and some tuples fail.
>
> 2) The bolt doesn't ack the tuple at all. Make sure the bolt acks or fails
> tuples without any exceptions.
>
> On Wed, Jul 27, 2016 at 10:22 PM, Kevin Peek <[email protected]> wrote:
>
>> We have a topology that is experiencing massive amounts of spout failures
>> without corresponding bolt failures. We have been interpreting these as
>> tuple timeouts, but we seem to be getting more of these failures than we
>> understand to be possible with timeouts.
>>
>> Our topology uses a Kafka spout and the topology is configured with:
>> topology.message.timeout.secs = 300
>> topology.max.spout.pending = 2500
>>
>> Based on these settings, I would expect the topology to experience a
>> maximum of 2500 tuple timeouts per 300 seconds. But from the Storm UI, we
>> see that after running for about 10 minutes, the topology will show about
>> 50K spout failures and zero bolt failures.
>>
>> Am I misunderstanding something that would allow more tuples to time out,
>> or is there another source of spout failures?
>>
>> Thanks in advance,
>> Kevin Peek
