Hi,

We are trying to use Dataflow in Prod, and one of our main concerns right
now is the "infinite retry" behavior, which might stall the whole
pipeline.

For all the DoFns we've implemented ourselves, we've added error handling
(catching and logging exceptions instead of rethrowing them) so that a
bundle can fail without stalling the pipeline. But we are a bit concerned
about the Beam native transforms that we cannot easily wrap, e.g.
PubSubIO transforms and DatastoreV1 transforms.
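
To illustrate the swallow-and-log pattern described above, here is a
minimal sketch in plain Java. It deliberately uses no Beam classes, and
the names SafeProcess and processAll are hypothetical, not our production
code or any Beam API. In a real DoFn the same try/catch would sit inside
the method annotated with @ProcessElement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper (not a Beam API): applies a step to each element and
// swallows per-element failures so one bad element cannot block the rest.
public class SafeProcess {
    private static final Logger LOG = Logger.getLogger(SafeProcess.class.getName());

    public static <I, O> List<O> processAll(
            List<I> inputs, Function<I, O> step, List<I> failed) {
        List<O> outputs = new ArrayList<>();
        for (I input : inputs) {
            try {
                outputs.add(step.apply(input));
            } catch (RuntimeException e) {
                // Log and record the failing element instead of rethrowing,
                // so the caller never sees the exception and never retries.
                LOG.log(Level.WARNING, "Skipping element after failure: " + input, e);
                failed.add(input);
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<String> failed = new ArrayList<>();
        List<Integer> parsed =
                processAll(Arrays.asList("1", "oops", "3"), Integer::parseInt, failed);
        System.out.println("parsed=" + parsed + " failed=" + failed);
    }
}
```

Collecting the failed elements, rather than only logging them, leaves the
option of writing them somewhere for later inspection. This is exactly the
kind of wrapper we cannot easily put around the built-in transforms.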

A few days ago I asked a specific question in this group about how to
catch exceptions in DatastoreV1 transforms. The recommended approach was
to either 1) duplicate the code in the current DatastoreV1 implementation
and swallow the exception instead of throwing, or 2) follow the
implementation of BigQueryIO and add support for a custom retry policy.
Both are feasible options, but doesn't that mean that eventually all Beam
native transforms would need to implement something like 2) if we want to
use them in Prod?

So in short, what is the currently recommended approach or workaround for
saying: just let this bundle fail and process the rest of the elements,
instead of stalling the pipeline?

Thanks!
-- 
Derek Hao Hu

Software Engineer | Snapchat
Snap Inc.
