Yes, good idea. I would love to have this functionality for passing some user-defined data from bolt to spout on failures.
Ravi

On 27 Sep 2016 10:05 p.m., "Kyle Nusbaum" <[email protected]> wrote:

> It seems to me that this can be solved by allowing a user to attach some
> arbitrary data to a call to fail(), which is passed to the spout.
> So there would be an overload of fail in IOutputCollector which takes
> both the Tuple input and also some object to give to the spout. The
> spout's fail method would now accept an object as a second argument.
>
> The spout can then decide what to do about the failure based on the
> content of the object.
>
> This makes it generic, possibly useful for other things like reporting,
> etc. I only looked at the relevant code briefly, but it looks like it
> would also be relatively simple to implement.
> -- Kyle
>
> On Tuesday, September 27, 2016 12:06 PM, Tech Id <[email protected]> wrote:
>
> Any more thoughts on this?
> Seems like a useful feature for all the spouts/bolts.
>
> On Wed, Sep 21, 2016 at 9:09 AM, S G <[email protected]> wrote:
>
> > Thank you Aaron.
> >
> > We use Kafka and JMS spouts and several bolts (Elastic-Search, Solr,
> > Cassandra, Couchbase and HDFS) in different scenarios and need to have
> > the dead-letter functionality for almost all these scenarios.
> > Locally we have this functionality almost ready for writing
> > dead-letters to Solr or Kafka.
> > I will try to contribute the same to Storm as a PR and we can then look
> > into adding the failing tuple as well. I agree adding the failing tuple
> > would be somewhat more complicated.
> >
> > On Tue, Sep 20, 2016 at 4:34 PM, Aaron Niskodé-Dossett <[email protected]>
> > wrote:
> >
> > > I like the idea, especially if it can be implemented as generically
> > > as possible. Ideally we could "dead letter" both the original tuple
> > > and the tuple that itself failed. Intervening transformations could
> > > have changed the original tuple. I realize that adds a lot of
> > > complexity to your idea and may not be feasible.
> > >
> > > On Tue, Sep 20, 2016 at 1:15 AM S G <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I want to gather some thoughts on a suggestion to provide a
> > > > dead-letter functionality common to all spouts/bolts.
> > > >
> > > > Currently, if any spout / bolt reports a failure, it is retried by
> > > > the spout.
> > > > For a single bolt failure in a large DAG, this retry logic can
> > > > cause several perfectly successful components to replay, and yet
> > > > the Tuple could fail at exactly the same bolt on retry.
> > > >
> > > > This is usually fine (if the failure was temporary, say due to a
> > > > network glitch), but sometimes the message is bad enough that it
> > > > should not be retried, yet important enough that its failure
> > > > should not be ignored.
> > > >
> > > > Example: ElasticSearch-bolt receiving bytes from Kafka-Spout.
> > > >
> > > > Most of the time, it is able to deserialize the bytes correctly,
> > > > but sometimes a badly formatted message fails to deserialize. For
> > > > such cases, the Kafka-Spout should not retry, nor should the
> > > > ES-bolt report a success. It should however be reported to the
> > > > user somehow that a badly serialized message entered the stream.
> > > >
> > > > For cases like a temporary network glitch, the Tuple should be
> > > > retried.
> > > >
> > > > So the proposal is to implement a dead-letter topic as:
> > > >
> > > > 1) Add a new method *failWithoutRetry(Tuple, Exception)* in the
> > > > collector. Bolts will begin using it once it's available for use.
> > > >
> > > > 2) Provide the ability to *configure a dead-letter data-store in
> > > > the spout* for failed messages reported by #1 above.
> > > >
> > > > The configurable data-store should support Kafka, Solr and Redis
> > > > to begin with (plus the option to implement one's own and drop a
> > > > jar file in the classpath).
> > > >
> > > > Such a feature should benefit all the spouts as:
> > > >
> > > > 1) Topologies will not block replaying the same doomed-to-fail
> > > > tuples.
> > > > 2) Users can set alerts on dead-letters and easily find actual
> > > > problems in their topologies, rather than analyzing all failed
> > > > tuples only to find that they failed because of a temporary
> > > > network glitch.
> > > > 3) Since the entire Tuple is put into the dead-letter, all the
> > > > data is available for retrying after fixing the topology code.
> > > >
> > > > Please share your thoughts if you think it can benefit Storm in a
> > > > generic way.
> > > >
> > > > Thx,
> > > > SG
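
To make the two suggestions in this thread concrete, here is a minimal Java sketch of how a spout might route failures. It deliberately does not use Storm's real classes: DeadLetterSketch, DeadLetterStore, InMemoryDeadLetterStore, NonRetryable and SketchSpout are all hypothetical names made up for illustration, and the two-argument fail(...) is only the signature proposed above, not an existing Storm API. The assumption is that a bolt marks a tuple as non-retryable by attaching a marker object to the failure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; none of these types exist in Storm.
final class DeadLetterSketch {

    /** Hypothetical pluggable dead-letter data-store configured on the spout. */
    interface DeadLetterStore {
        void write(Object msgId, Object failureInfo);
    }

    /** In-memory stand-in; a real store might write to Kafka, Solr or Redis. */
    static class InMemoryDeadLetterStore implements DeadLetterStore {
        private final List<String> records = new ArrayList<>();

        @Override
        public void write(Object msgId, Object failureInfo) {
            records.add(msgId + ": " + failureInfo);
        }

        List<String> records() {
            return records;
        }
    }

    /** Hypothetical marker a bolt attaches to say "do not replay this tuple". */
    static class NonRetryable {
        final Exception cause;

        NonRetryable(Exception cause) {
            this.cause = cause;
        }

        @Override
        public String toString() {
            return "non-retryable failure: " + cause.getMessage();
        }
    }

    /** Stand-in for a spout whose fail method accepts a second argument. */
    static class SketchSpout {
        private final DeadLetterStore store;
        final List<Object> replayQueue = new ArrayList<>();

        SketchSpout(DeadLetterStore store) {
            this.store = store;
        }

        /** Dead-letters non-retryable failures; queues everything else for replay. */
        void fail(Object msgId, Object failureInfo) {
            if (failureInfo instanceof NonRetryable) {
                store.write(msgId, failureInfo);
            } else {
                replayQueue.add(msgId);
            }
        }
    }

    public static void main(String[] args) {
        InMemoryDeadLetterStore store = new InMemoryDeadLetterStore();
        SketchSpout spout = new SketchSpout(store);

        // A badly serialized message: retrying cannot help, so dead-letter it.
        spout.fail("msg-1", new NonRetryable(new IllegalArgumentException("cannot deserialize")));
        // A temporary glitch with no extra data attached: replay as usual.
        spout.fail("msg-2", null);

        System.out.println("dead letters: " + store.records());
        System.out.println("to replay:    " + spout.replayQueue);
    }
}
```

Passing a generic object rather than only an Exception keeps the spout-side decision open: the same hook could carry reporting data, as Kyle suggests, while failWithoutRetry(Tuple, Exception) could be a thin convenience on top of it.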
