Thank you, Aaron. We use Kafka and JMS spouts and several bolts (Elastic-Search, Solr, Cassandra, Couchbase and HDFS) in different scenarios, and we need dead-letter functionality for almost all of them. Locally we have this functionality almost ready for writing dead letters to Solr or Kafka. I will try to contribute it to Storm as a PR, and we can then look into adding the failing tuple as well. I agree that adding the failing tuple would be somewhat more complicated.

On Tue, Sep 20, 2016 at 4:34 PM, Aaron Niskodé-Dossett <doss...@gmail.com> wrote:

> I like the idea, especially if it can be implemented as generically as
> possible. Ideally we could "dead letter" both the original tuple and the
> tuple that itself failed. Intervening transformations could have changed
> the original tuple. I realize that adds a lot of complexity to your idea
> and may not be feasible.
>
> On Tue, Sep 20, 2016 at 1:15 AM S G <sg.online.em...@gmail.com> wrote:
>
> > Hi,
> >
> > I want to gather some thoughts on a suggestion to provide a dead-letter
> > functionality common to all spouts/bolts.
> >
> > Currently, if any spout / bolt reports a failure, it is retried by the
> > spout.
> > For a single bolt failure in a large DAG, this retry logic can cause
> > several perfectly successful components to replay, and yet the Tuple could
> > fail at exactly the same bolt on retry.
> >
> > This is usually fine (if the failure was temporary, say due to a network
> > glitch), but sometimes the message is bad enough that it should not be
> > retried, yet important enough that its failure should not be ignored.
> >
> > Example: ElasticSearch-bolt receiving bytes from Kafka-Spout.
> >
> > Most of the time, it is able to deserialize the bytes correctly, but
> > sometimes a badly formatted message fails to deserialize. In such cases,
> > neither should Kafka-Spout retry nor should ES-bolt report a success. It
> > should, however, be reported to the user somehow that a badly serialized
> > message entered the stream.
> >
> > For cases like a temporary network glitch, the Tuple should be retried.
> >
> > So the proposal is to implement a dead-letter topic as:
> >
> > 1) Add a new method *failWithoutRetry(Tuple, Exception)* in the collector.
> > Bolts will begin using it once it's available for use.
> >
> > 2) Provide the ability to *configure a dead-letter data-store in the
> > spout* for failed messages reported by #1 above.
> >
> > The configurable data-store should support kafka, solr and redis to
> > begin with (plus the option to implement one's own and drop a jar file
> > in the classpath).
> >
> > Such a feature should benefit all the spouts as:
> >
> > 1) Topologies will not block replaying the same doomed-to-fail tuples.
> > 2) Users can set alerts on dead-letters and easily find actual problems
> > in their topologies, rather than analyze all failed tuples only to find that
> > they failed because of a temporary network glitch.
> > 3) Since the entire Tuple is put into the dead-letter, all the data is
> > available for retrying after fixing the topology code.
> >
> > Please share your thoughts if you think it can benefit storm in a generic
> > way.
> >
> > Thx,
> > SG
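
To make the two-part proposal in the quoted message a bit more concrete, here is a rough, self-contained Java sketch of how the pieces might fit together. It does not use Storm at all: `DeadLetterStore`, `InMemoryDeadLetterStore`, `SketchCollector` and `DeadLetterSketch` are hypothetical names made up for illustration, and `failWithoutRetry` is only the proposed method, not something the Storm collector is known to provide. A plain `Object` stands in for the Tuple.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pluggable sink for dead letters; the proposal mentions
// Kafka, Solr and Redis as possible backing stores.
interface DeadLetterStore {
    void write(Object message, Exception cause);
}

// In-memory store, used here only so the sketch runs without external systems.
class InMemoryDeadLetterStore implements DeadLetterStore {
    final List<String> entries = new ArrayList<>();

    @Override
    public void write(Object message, Exception cause) {
        entries.add(message + " -> " + cause.getMessage());
    }
}

// Hypothetical stand-in for a collector. fail() asks for a replay,
// failWithoutRetry() routes the message to the dead-letter store instead.
class SketchCollector {
    private final DeadLetterStore store;
    final List<Object> replayQueue = new ArrayList<>();

    SketchCollector(DeadLetterStore store) {
        this.store = store;
    }

    void fail(Object message) {
        replayQueue.add(message);
    }

    void failWithoutRetry(Object message, Exception cause) {
        store.write(message, cause);
    }
}

class DeadLetterSketch {
    // Stand-in for a bolt's processing step: a payload that cannot be parsed
    // is treated as a permanent failure, so it is dead-lettered, not replayed.
    static void process(String payload, SketchCollector collector) {
        try {
            Integer.parseInt(payload);
        } catch (NumberFormatException e) {
            collector.failWithoutRetry(payload, e);
        }
    }

    public static void main(String[] args) {
        InMemoryDeadLetterStore store = new InMemoryDeadLetterStore();
        SketchCollector collector = new SketchCollector(store);
        process("42", collector);
        process("not-a-number", collector);
        System.out.println(store.entries.size() + " dead letter(s), "
                + collector.replayQueue.size() + " replay(s)");
    }
}
```

The point of keeping the store behind an interface is the "drop a jar file in the classpath" idea from the proposal: a Kafka- or Solr-backed implementation could be swapped in without touching the bolts. How the real collector and spout would be wired together is left open here.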