Re: [DISCUSS] Provision for dead-letter topic in storm

Ravi Sharma Tue, 27 Sep 2016 19:30:19 -0700

Yes, good idea. I would love to have this ability to pass some user-defined
data from bolt to spout on failures.


Ravi

On 27 Sep 2016 10:05 p.m., "Kyle Nusbaum" <[email protected]>
wrote:

> It seems to me that this can be solved by allowing a user to attach some
> arbitrary data to a call to fail(), which is passed to the spout.
> So there would be an overload of fail in IOutputCollector which takes
> both the Tuple input and also some object to give to the spout. The spout's
> fail method would now accept an object as a second argument.
>
> The spout can then decide what to do about the failure based on the
> content of the object.
>
> This makes it generic, possibly useful for other things like reporting,
> etc. I only looked at the relevant code briefly, but it looks like it would
> also be relatively simple to implement.
>
> -- Kyle
>
>     On Tuesday, September 27, 2016 12:06 PM, Tech Id <[email protected]> wrote:
>
>
>  Any more thoughts on this?
> Seems like a useful feature for all the spouts/bolts.
>
> On Wed, Sep 21, 2016 at 9:09 AM, S G <[email protected]> wrote:
>
> > Thank you Aaron.
> >
> > We use Kafka and JMS spouts and several bolts - Elastic-Search, Solr,
> > Cassandra, Couchbase and HDFS in different scenarios and need to have the
> > dead letter functionality for almost all these scenarios.
> > Locally we have this functionality almost ready for writing dead-letters
> > to Solr or Kafka.
> > I will try to contribute the same to Storm as a PR and we can then look
> > into adding the failing tuple as well. I agree adding the failing tuple
> > would be somewhat more complicated.
> >
> >
> > On Tue, Sep 20, 2016 at 4:34 PM, Aaron Niskodé-Dossett <[email protected]>
> > wrote:
> >
> > > I like the idea, especially if it can be implemented as generically as
> > > possible. Ideally we could "dead letter" both the original tuple and
> > > the tuple that actually failed, since intervening transformations could
> > > have changed the original tuple. I realize that adds a lot of
> > > complexity to your idea and may not be feasible.
> > >
> > > On Tue, Sep 20, 2016 at 1:15 AM S G <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I want to gather some thoughts on a suggestion to provide
> > > > dead-letter functionality common to all spouts/bolts.
> > > >
> > > > Currently, if any spout / bolt reports a failure, the tuple is retried
> > > > by the spout.
> > > > For a single bolt failure in a large DAG, this retry logic can cause
> > > > several perfectly successful components to replay, and yet the Tuple
> > > > could fail at exactly the same bolt on retry.
> > > >
> > > > This is usually fine (if the failure was temporary, say due to a
> > > > network glitch), but sometimes the message is bad enough that it
> > > > should not be retried, yet important enough that its failure should
> > > > not be ignored.
> > > >
> > > > Example: ElasticSearch-bolt receiving bytes from Kafka-Spout.
> > > >
> > > > Most of the time, it is able to deserialize the bytes correctly, but
> > > > sometimes a badly formatted message fails to deserialize. In such
> > > > cases, the Kafka-Spout should not retry, and the ES-bolt should not
> > > > report a success either. It should, however, be reported to the user
> > > > somehow that a badly serialized message entered the stream.
> > > >
> > > > For cases like a temporary network glitch, the Tuple should be retried.
> > > >
> > > > So the proposal is to implement a dead-letter topic as follows:
> > > >
> > > > 1) Add a new method *failWithoutRetry(Tuple, Exception)* in the
> > > > collector. Bolts will begin using it once it's available for use.
> > > >
> > > > 2) Provide the ability to *configure a dead-letter data-store in the
> > > > spout* for failed messages reported by #1 above.
> > > >
> > > >
> > > > The configurable data-store should support Kafka, Solr and Redis to
> > > > begin with (plus the option to implement one's own and drop a jar
> > > > file on the classpath).
> > > >
> > > > Such a feature should benefit all the spouts as:
> > > >
> > > > 1) Topologies will not block while replaying the same doomed-to-fail
> > > > tuples.
> > > > 2) Users can set alerts on dead-letters and easily find the actual
> > > > problems in their topologies, rather than analyze all failed tuples
> > > > only to find that they failed because of a temporary network glitch.
> > > > 3) Since the entire Tuple is put into the dead-letter, all the data
> > > > is available for retrying after fixing the topology code.
> > > >
> > > > Please share your thoughts if you think it can benefit Storm in a
> > > > generic way.
> > > >
> > > > Thx,
> > > > SG
> > > >
> > >
> >
>
>

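For illustration only, the two ideas in this thread (attaching data to a
failure, and routing non-retryable failures to a configurable dead-letter
store) could be sketched roughly as below. Every type here (FailureInfo,
DeadLetterStore, InMemoryDeadLetterStore, SketchSpout) is a hypothetical
stand-in invented for this sketch and is not part of Storm's API; message ids
and payloads are plain strings to keep it self-contained.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for this sketch; none of these are Storm classes.

/** Data a bolt would attach to a failure so the spout can decide what to do. */
final class FailureInfo {
    final boolean retryable;
    final String reason;

    FailureInfo(boolean retryable, String reason) {
        this.retryable = retryable;
        this.reason = reason;
    }
}

/** Pluggable sink for messages that should not be retried (Kafka, Solr, ...). */
interface DeadLetterStore {
    void write(String messageId, String payload, String reason);
}

/** In-memory store used only to make the sketch runnable. */
final class InMemoryDeadLetterStore implements DeadLetterStore {
    final List<String> entries = new ArrayList<>();

    @Override
    public void write(String messageId, String payload, String reason) {
        entries.add(messageId + ": " + reason);
    }
}

/** Spout-like class whose fail callback receives the bolt's attached data. */
final class SketchSpout {
    final Deque<String> retryQueue = new ArrayDeque<>();
    private final Map<String, String> pending = new HashMap<>();
    private final DeadLetterStore deadLetters;

    SketchSpout(DeadLetterStore deadLetters) {
        this.deadLetters = deadLetters;
    }

    /** Remember the payload until the message is acked or failed. */
    void emit(String messageId, String payload) {
        pending.put(messageId, payload);
    }

    /** Dead-letter non-retryable failures; queue everything else for replay. */
    void fail(String messageId, FailureInfo info) {
        if (!pending.containsKey(messageId)) {
            return; // unknown or already handled
        }
        if (info != null && !info.retryable) {
            deadLetters.write(messageId, pending.remove(messageId), info.reason);
        } else {
            retryQueue.add(messageId); // payload stays in pending for replay
        }
    }
}

class DeadLetterDemo {
    public static void main(String[] args) {
        InMemoryDeadLetterStore store = new InMemoryDeadLetterStore();
        SketchSpout spout = new SketchSpout(store);
        spout.emit("m1", "{not valid json");
        spout.emit("m2", "{\"ok\":true}");
        spout.fail("m1", new FailureInfo(false, "deserialization error"));
        spout.fail("m2", new FailureInfo(true, "network timeout"));
        System.out.println("dead letters: " + store.entries);
        System.out.println("retry queue: " + spout.retryQueue);
    }
}
```

In this sketch the bolt side only has to say whether the failure is retryable
and why; the choice of store stays with the spout, which matches the
suggestion above that the spout decides what to do based on the attached
object.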