Chaim, Batch loads in BigQuery load the entire table in one operation, which means it doesn't make sense to try to catch individual failures; the entire table load will fail anyway.
Do you know what kind of errors you are getting? If there are malformed entries, can you add a ParDo beforehand to clean the data? (A sketch of this pattern follows the quoted thread.)

Reuven

On Sat, Sep 16, 2017 at 11:07 AM, Chaim Turkel <[email protected]> wrote:
> I am using batch, since streaming cannot be done with partitions on
> data older than 30 days.
> The question is how can I catch the exception in the pipeline so that
> other collections do not fail.
>
> On Fri, Sep 15, 2017 at 7:37 PM, Eugene Kirpichov
> <[email protected]> wrote:
> > Are you using the streaming inserts or batch loads method for writing?
> > If it's streaming inserts, BigQueryIO can already return the bad records,
> > and I believe it won't fail the pipeline, so I'm assuming it's batch loads.
> > For batch loads, would it be sufficient for your purposes if
> > BigQueryIO.write() let you configure the configuration.load.maxBadRecords
> > parameter (see https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs)?
> >
> > On Thu, Sep 14, 2017 at 10:29 PM Chaim Turkel <[email protected]> wrote:
> >
> >> I am using the sink of BigQueryIO, so the example is not the same. The
> >> example is about bad data from reading; I have problems when writing.
> >> There can be multiple errors when writing to BigQuery, and if it fails
> >> there is no way to catch this error, and the whole pipeline fails.
> >>
> >> Chaim
> >>
> >> On Thu, Sep 14, 2017 at 5:48 PM, Reuven Lax <[email protected]> wrote:
> >> > What sort of error? You can always put a try/catch inside your DoFns
> >> > to catch the majority of errors. A common pattern is to save records
> >> > that caused exceptions out to a separate output so you can debug them.
> >> > This blog post
> >> > <https://cloud.google.com/blog/big-data/2016/01/handling-invalid-inputs-in-dataflow>
> >> > explains the pattern.
> >> >
> >> > Reuven
> >> >
> >> > On Thu, Sep 14, 2017 at 1:43 AM, Chaim Turkel <[email protected]> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> In one pipeline I have multiple PCollections. If I have an error on
> >> >> one, then the whole pipeline is canceled. Is there a way to catch the
> >> >> error and log it, and for all other PCollections to continue?
> >> >>
> >> >> Chaim
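[Editor's note] The pattern Reuven suggests twice in this thread -- a cleaning ParDo in front of the BigQuery sink, with a try/catch inside the DoFn routing bad records to a separate output, as in the linked blog post -- looks roughly like the following. This is a minimal sketch, not code from the thread: the PCollection<String> named "input", the CSV-style parsing, and the tag names are all illustrative.

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionTuple;
import org.apache.beam.sdk.values.TupleTag;
import org.apache.beam.sdk.values.TupleTagList;

// One tag per output: rows that converted cleanly, and raw records that didn't.
final TupleTag<TableRow> goodTag = new TupleTag<TableRow>() {};
final TupleTag<String> badTag = new TupleTag<String>() {};

PCollectionTuple outputs = input.apply("CleanRecords",
    ParDo.of(new DoFn<String, TableRow>() {
      @ProcessElement
      public void processElement(ProcessContext c) {
        try {
          // Hypothetical conversion; replace with real parsing/validation.
          String[] fields = c.element().split(",");
          TableRow row = new TableRow().set("id", fields[0]).set("value", fields[1]);
          c.output(row);
        } catch (Exception e) {
          // Instead of letting the exception fail the bundle (and eventually
          // the whole pipeline), divert the offending record for inspection.
          c.output(badTag, c.element());
        }
      }
    }).withOutputTags(goodTag, TupleTagList.of(badTag)));

PCollection<TableRow> goodRows = outputs.get(goodTag);  // feed to BigQueryIO.writeTableRows()
PCollection<String> badRecords = outputs.get(badTag);   // log, or write to a dead-letter sink

The key point is that the try/catch lives inside the DoFn, so a bad record becomes data on the badTag output rather than an exception that fails the bundle, and only validated rows ever reach the atomic batch load.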
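Eugene's remark about streaming inserts refers to BigQueryIO handing rejected rows back to the pipeline rather than failing it. A sketch, assuming a Beam release in which WriteResult exposes getFailedInserts(), an existing destination table, and a PCollection<TableRow> named "rows" (the table spec is a placeholder):

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.InsertRetryPolicy;
import org.apache.beam.sdk.io.gcp.bigquery.WriteResult;
import org.apache.beam.sdk.values.PCollection;

WriteResult result = rows.apply("WriteToBQ",
    BigQueryIO.writeTableRows()
        .to("my-project:my_dataset.my_table")  // placeholder table spec
        .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
        // Assumes the table already exists, so no schema is needed here.
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
        // Retry transient errors; permanently rejected rows come back below.
        .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors()));

// Rows BigQuery rejected (e.g. schema mismatches), returned as data
// instead of failing the pipeline.
PCollection<TableRow> failedRows = result.getFailedInserts();

As Chaim notes, this path is unavailable to him because streaming inserts cannot write to partitions older than 30 days, which is why the discussion turns to batch loads.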
