The exceptions could be from bad data - I am working on that - or
from exceeding the quota.
The problem is that if I have 2 collections in the pipeline and one
fails on the quota, the other will also fail, even though it should
have succeeded.

On Sat, Sep 16, 2017 at 10:35 PM, Eugene Kirpichov
<[email protected]> wrote:
> There is no way to catch an exception inside a transform unless you
> wrote the transform yourself and have control over the code of its DoFns.
> That's why I'm asking whether configuring the allowed number of bad
> records would be an acceptable workaround.
>
> On Sat, Sep 16, 2017, 11:07 AM Chaim Turkel <[email protected]> wrote:
>
>> I am using batch, since streaming cannot write to partitions with
>> data older than 30 days.
>> The question is how I can catch the exception in the pipeline so that
>> the other collections do not fail.
>>
>> On Fri, Sep 15, 2017 at 7:37 PM, Eugene Kirpichov
>> <[email protected]> wrote:
>> > Are you using streaming inserts or batch loads method for writing?
>> > If it's streaming inserts, BigQueryIO already can return the bad
>> > records, and I believe it won't fail the pipeline, so I'm assuming
>> > it's batch loads.
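>> > (If it is streaming inserts after all, here is a rough sketch of
>> > pulling the failed records out of the write - exact method names
>> > depend on the Beam version, and "rows" / the table name are just
>> > placeholders:)
>> >
>> >   import com.google.api.services.bigquery.model.TableRow;
>> >   import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
>> >   import org.apache.beam.sdk.io.gcp.bigquery.InsertRetryPolicy;
>> >   import org.apache.beam.sdk.io.gcp.bigquery.WriteResult;
>> >   import org.apache.beam.sdk.values.PCollection;
>> >
>> >   // rows is an existing PCollection<TableRow>
>> >   WriteResult result = rows.apply("WriteToBQ",
>> >       BigQueryIO.writeTableRows()
>> >           .to("my-project:my_dataset.my_table")
>> >           .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
>> >           // assumes the target table already exists
>> >           .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
>> >           // don't retry permanently-failing rows; surface them instead
>> >           .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors()));
>> >
>> >   // rows BigQuery rejected; log them or write them to a dead-letter table
>> >   PCollection<TableRow> failedRows = result.getFailedInserts();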
>> > For batch loads, would it be sufficient for your purposes if
>> > BigQueryIO.write() let you configure the configuration.load.maxBadRecords
>> > parameter (see
>> > https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs)?
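>> > (That is the knob the BigQuery jobs API itself exposes - shown here
>> > via the BigQuery Java client just to illustrate what the setting
>> > does; BigQueryIO does not expose it today:)
>> >
>> >   import com.google.api.services.bigquery.model.JobConfigurationLoad;
>> >
>> >   // configuration.load.maxBadRecords: how many invalid rows BigQuery
>> >   // may skip before it fails the load job (the default is 0)
>> >   JobConfigurationLoad loadConfig = new JobConfigurationLoad()
>> >       .setMaxBadRecords(10);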
>> >
>> > On Thu, Sep 14, 2017 at 10:29 PM Chaim Turkel <[email protected]> wrote:
>> >
>> >> I am using the BigQueryIO sink, so the example is not the same. The
>> >> example deals with bad data when reading; my problems are when writing.
>> >> There can be multiple errors when writing to BigQuery, and if the write
>> >> fails there is no way to catch the error, so the whole pipeline fails.
>> >>
>> >> chaim
>> >>
>> >> On Thu, Sep 14, 2017 at 5:48 PM, Reuven Lax <[email protected]>
>> >> wrote:
>> >> > What sort of error? You can always put a try/catch inside your DoFns
>> >> > to catch the majority of errors. A common pattern is to save records
>> >> > that caused exceptions out to a separate output so you can debug them.
>> >> > This blog post
>> >> > <https://cloud.google.com/blog/big-data/2016/01/handling-invalid-inputs-in-dataflow>
>> >> > explained the pattern.
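>> >> > Roughly, the pattern looks like this (a sketch only - "input" is a
>> >> > PCollection<String> and parseRow() is a placeholder for whatever
>> >> > conversion can throw on bad data):
>> >> >
>> >> >   import com.google.api.services.bigquery.model.TableRow;
>> >> >   import org.apache.beam.sdk.transforms.DoFn;
>> >> >   import org.apache.beam.sdk.transforms.ParDo;
>> >> >   import org.apache.beam.sdk.values.PCollection;
>> >> >   import org.apache.beam.sdk.values.PCollectionTuple;
>> >> >   import org.apache.beam.sdk.values.TupleTag;
>> >> >   import org.apache.beam.sdk.values.TupleTagList;
>> >> >
>> >> >   final TupleTag<TableRow> parsedTag = new TupleTag<TableRow>() {};
>> >> >   final TupleTag<String> failedTag = new TupleTag<String>() {};
>> >> >
>> >> >   PCollectionTuple outputs = input.apply("ParseRows",
>> >> >       ParDo.of(new DoFn<String, TableRow>() {
>> >> >         @ProcessElement
>> >> >         public void processElement(ProcessContext c) {
>> >> >           try {
>> >> >             c.output(parseRow(c.element()));   // placeholder conversion
>> >> >           } catch (Exception e) {
>> >> >             // route the bad record to a side output instead of failing
>> >> >             c.output(failedTag, c.element());
>> >> >           }
>> >> >         }
>> >> >       }).withOutputTags(parsedTag, TupleTagList.of(failedTag)));
>> >> >
>> >> >   PCollection<TableRow> good = outputs.get(parsedTag);
>> >> >   // log the bad records, or write them to a dead-letter table
>> >> >   PCollection<String> bad = outputs.get(failedTag);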
>> >> >
>> >> > Reuven
>> >> >
>> >> > On Thu, Sep 14, 2017 at 1:43 AM, Chaim Turkel <[email protected]> wrote:
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >>   In one pipeline I have multiple PCollections. If I have an error on
>> >> >> one, the whole pipeline is canceled. Is there a way to catch the
>> >> >> error and log it, so that all other PCollections continue?
>> >> >>
>> >> >>
>> >> >> chaim
>> >> >>
>> >>
>>
