Any more thoughts on this?
Seems like a useful feature for all the spouts/bolts.

On Wed, Sep 21, 2016 at 9:09 AM, S G <[email protected]> wrote:

> Thank you Aaron.
>
> We use Kafka and JMS spouts and several bolts - Elasticsearch, Solr,
> Cassandra, Couchbase and HDFS - in different scenarios, and we need the
> dead-letter functionality in almost all of these scenarios.
> Locally we have this functionality almost ready for writing dead-letters to
> Solr or Kafka.
> I will try to contribute it to Storm as a PR, and we can then look into
> adding the failing tuple as well. I agree that adding the failing tuple
> would be somewhat more complicated.
>
>
> On Tue, Sep 20, 2016 at 4:34 PM, Aaron Niskodé-Dossett <[email protected]>
> wrote:
>
> > I like the idea, especially if it can be implemented as generically as
> > possible. Ideally we could "dead letter" both the original tuple and the
> > tuple that itself failed. Intervening transformations could have changed
> > the original tuple. I realize that adds a lot of complexity to your idea
> > and may not be feasible.
> > On Tue, Sep 20, 2016 at 1:15 AM S G <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I want to gather some thoughts on a suggestion to provide a dead-letter
> > > functionality common to all spouts/bolts.
> > >
> > > Currently, if any spout / bolt reports a failure, the tuple is retried by
> > > the spout.
> > > For a single bolt failure in a large DAG, this retry logic can cause
> > > several perfectly successful components to replay, and yet the tuple could
> > > fail at exactly the same bolt on retry.
> > >
> > > This is usually fine (if the failure was temporary, say due to a network
> > > glitch), but sometimes the message is bad enough that it should not be
> > > retried, yet important enough that its failure should not be ignored.
> > >
> > > Example: ElasticSearch-bolt receiving bytes from Kafka-Spout.
> > >
> > > Most of the time it is able to deserialize the bytes correctly, but
> > > sometimes a badly formatted message fails to deserialize. In such cases,
> > > the Kafka-Spout should not retry the tuple, nor should the ES-bolt report
> > > a success. The user should, however, somehow be told that a badly
> > > serialized message entered the stream.
> > >
> > > For cases like a temporary network glitch, the tuple should be retried.
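> > >
> > > To make the dilemma concrete, here is a minimal sketch of such a bolt with
> > > the OutputCollector we have today, where only ack() and fail() are
> > > available. MyDoc, DeserializationException, deserialize() and indexIntoEs()
> > > are placeholders, not real APIs; imports are omitted:
> > >
> > >     public class EsIndexBolt extends BaseRichBolt {
> > >         private OutputCollector collector;
> > >
> > >         public void prepare(Map conf, TopologyContext ctx, OutputCollector collector) {
> > >             this.collector = collector;
> > >         }
> > >
> > >         public void execute(Tuple tuple) {
> > >             try {
> > >                 // placeholder: throws DeserializationException on a badly formatted message
> > >                 MyDoc doc = deserialize(tuple.getBinary(0));
> > >                 // placeholder: throws IOException on a network glitch
> > >                 indexIntoEs(doc);
> > >                 collector.ack(tuple);
> > >             } catch (DeserializationException e) {
> > >                 // Doomed message: fail() causes an endless replay loop, while
> > >                 // ack()-and-log silently hides the problem. Neither is good.
> > >                 collector.fail(tuple);
> > >             } catch (IOException e) {
> > >                 // Transient failure: here a replay is exactly what we want.
> > >                 collector.fail(tuple);
> > >             }
> > >         }
> > >
> > >         public void declareOutputFields(OutputFieldsDeclarer declarer) { }
> > >     }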
> > >
> > > So the proposal is to implement the dead-letter functionality as follows:
> > >
> > > 1) Add a new method *failWithoutRetry(Tuple, Exception)* to the collector.
> > > Bolts will begin using it once it is available.
> > >
> > > 2) Provide the ability to *configure a dead-letter data-store in the
> > > spout* for failed messages reported by #1 above.
> > >
> > >
> > > The configurable data-store should support Kafka, Solr and Redis to begin
> > > with (plus the option to implement one's own and drop a jar file into the
> > > classpath). A rough sketch of both pieces follows.
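> > >
> > > This is only an illustration - failWithoutRetry, DeadLetterStore and the
> > > config keys below are made-up names for this sketch, not existing Storm
> > > APIs. With #1, the first catch block in the earlier bolt sketch would
> > > become:
> > >
> > >         } catch (DeserializationException e) {
> > >             // Never retried; the spout hands (tuple, exception) over to
> > >             // the configured dead-letter store.
> > >             collector.failWithoutRetry(tuple, e);
> > >         }
> > >
> > > and for #2 the pluggable store could be as small as:
> > >
> > >     public interface DeadLetterStore extends Serializable {
> > >         void open(Map<String, Object> topoConf);
> > >         void store(Tuple failedTuple, Throwable cause);
> > >         void close();
> > >     }
> > >
> > >     // Possible topology configuration (illustrative names):
> > >     // topology.deadletter.store.class: "com.example.KafkaDeadLetterStore"
> > >     // topology.deadletter.kafka.topic: "storm-dead-letters"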
> > >
> > > Such a feature should benefit all the spouts and bolts:
> > >
> > > 1) Topologies will not get stuck replaying the same doomed-to-fail tuples.
> > > 2) Users can set alerts on dead-letters and easily find the actual
> > > problems in their topologies, rather than analyzing all failed tuples only
> > > to find that they failed because of a temporary network glitch.
> > > 3) Since the entire tuple is put into the dead-letter store, all the data
> > > is available for retrying after the topology code is fixed.
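> > >
> > > For example, if the dead-letter store is a Kafka topic, #3 could be done
> > > with a small standalone tool like the sketch below (the topic names and
> > > the assumption that the original payload is stored as the record value
> > > are illustrative only):
> > >
> > >     import java.util.Collections;
> > >     import java.util.Properties;
> > >     import org.apache.kafka.clients.consumer.ConsumerRecord;
> > >     import org.apache.kafka.clients.consumer.ConsumerRecords;
> > >     import org.apache.kafka.clients.consumer.KafkaConsumer;
> > >     import org.apache.kafka.clients.producer.KafkaProducer;
> > >     import org.apache.kafka.clients.producer.ProducerRecord;
> > >
> > >     public class DeadLetterReplayer {
> > >         public static void main(String[] args) {
> > >             Properties props = new Properties();
> > >             props.put("bootstrap.servers", "localhost:9092");
> > >             props.put("group.id", "dead-letter-replayer");
> > >             props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
> > >             props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
> > >             props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
> > >             props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
> > >
> > >             try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
> > >                  KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
> > >                 consumer.subscribe(Collections.singletonList("storm-dead-letters"));
> > >                 while (true) {
> > >                     ConsumerRecords<byte[], byte[]> records = consumer.poll(1000);
> > >                     if (records.isEmpty()) {
> > >                         break;  // dead-letter topic drained
> > >                     }
> > >                     for (ConsumerRecord<byte[], byte[]> rec : records) {
> > >                         // Push the original payload back onto the source topic so
> > >                         // the (now fixed) topology picks it up again.
> > >                         producer.send(new ProducerRecord<>("source-topic", rec.key(), rec.value()));
> > >                     }
> > >                 }
> > >             }
> > >         }
> > >     }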
> > >
> > > Please share your thoughts on whether this can benefit Storm in a generic
> > > way.
> > >
> > > Thx,
> > > SG
> > >
> >
>
