Re: Infinite retry in streaming - is there a workaround?

2017-10-25 Thread Aleksandr
Hello Derek,
There is no general solution for a failing bundle. Some kinds of Dataflow
errors can be fixed using the Dataflow update feature. Another solution is
to catch exceptions inside your ParDo function.
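
A minimal sketch of the "catch exceptions" idea, written in plain Python
with no Beam dependency so it runs on its own. The names
process_with_dead_letter and parse_record are hypothetical and only
illustrate the pattern: each element is processed inside its own
try/except, and failures are logged and set aside instead of being
rethrown. In a real pipeline the same try/except would sit inside the
DoFn's processing method, and the failed elements could go to an
additional output rather than a list.

```python
import logging
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_with_dead_letter(
    elements: Iterable[T],
    fn: Callable[[T], R],
) -> Tuple[List[R], List[Tuple[T, str]]]:
    """Apply fn to each element, collecting failures instead of raising."""
    results: List[R] = []
    dead_letters: List[Tuple[T, str]] = []
    for element in elements:
        try:
            results.append(fn(element))
        except Exception as exc:
            # Deliberately broad: one bad element must not fail the whole batch.
            logging.warning("Dropping element %r: %s", element, exc)
            dead_letters.append((element, repr(exc)))
    return results, dead_letters


def parse_record(raw: str) -> int:
    """Hypothetical per-element step that fails on malformed input."""
    return int(raw)


# "oops" cannot be parsed, so it is set aside while the rest succeed.
good, bad = process_with_dead_letter(["1", "2", "oops", "4"], parse_record)
```

Keeping the failed element together with the error text makes it possible
to inspect or replay it later, which is usually preferable to silently
swallowing the exception.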



Re: Infinite retry in streaming - is there a workaround?

2017-10-25 Thread Griselda Cuevas
Hi Derek, yes you can use that mailing list and also the SO channel.

Cheers,
G




Re: Infinite retry in streaming - is there a workaround?

2017-10-25 Thread Derek Hao Hu
Aha, I just realized this is Dataflow behavior, not default Beam
behavior. :) Thanks, Griselda. I'll post in the SO channel.

BTW, do you know if there's a Dataflow mailing list for questions like
this? Would dataflow-feedback be the appropriate mailing list?

Thanks,

Derek



-- 
Derek Hao Hu

Software Engineer | Snapchat
Snap Inc.


Re: Infinite retry in streaming - is there a workaround?

2017-10-25 Thread Griselda Cuevas
Hi Derek - It sounds like this is a Dataflow-specific question, so I'd
recommend you also reach out through Dataflow's Stack Overflow channel.
I'm also cc'ing Thomas Groh, who might be able to help.





Re: Infinite retry in streaming - is there a workaround?

2017-10-20 Thread Derek Hao Hu
Kindly ping as I'm really curious about this. :p

Derek

On Thu, Oct 19, 2017 at 2:15 PM, Derek Hao Hu wrote:

> Hi,
>
> We are trying to use Dataflow in Prod, and right now one of our main
> concerns is the "infinite retry" behavior, which might stall the whole
> pipeline.
>
> For all the DoFns we've implemented ourselves, we've added some error
> handling or exception-swallowing mechanism so that individual bundles can
> fail and we just log the exceptions. But we are a bit concerned about the
> other Beam native transforms which we cannot easily wrap, e.g. the
> PubSubIO and DatastoreV1 transforms.
>
> A few days ago I asked a specific question in this group about how one can
> catch exceptions in DatastoreV1 transforms, and the recommended approach
> was either to 1) duplicate the code in the current DatastoreV1
> implementation and swallow the exception instead of throwing, or 2) follow
> the implementation of BigQueryIO and add support for a custom retry
> policy. Both are feasible options, but I'm a bit concerned: doesn't that
> mean that eventually all Beam native transforms need to implement
> something like 2) if we want to use them in Prod?
>
> So in short, I want to know what the recommended approach or workaround
> is right now to say, "hey, just let this bundle fail and process the rest
> of the elements" instead of stalling the whole pipeline.
>
> Thanks!
> --
> Derek Hao Hu
>
> Software Engineer | Snapchat
> Snap Inc.
>



-- 
Derek Hao Hu

Software Engineer | Snapchat
Snap Inc.