I believe this is BEAM-190, which is actively being worked on. However, it will probably not be ready in time for the first stable release.
https://issues.apache.org/jira/browse/BEAM-190

On Tue, Apr 11, 2017 at 7:52 AM, Lukasz Cwik <[email protected]> wrote:

> Have you thought of fetching the schema upfront from BigQuery and
> prefiltering out any records in a preceding DoFn instead of relying on
> BigQuery telling you that the schema doesn't match?
>
> Otherwise you are correct in believing that you will need to update
> BigQueryIO to have the retry/error semantics that you want.
>
> On Tue, Apr 11, 2017 at 1:12 AM, Josh <[email protected]> wrote:
>
>> What I really want to do is configure BigQueryIO to log an error and skip
>> the write if it receives a 4xx response from BigQuery (e.g. the element
>> does not match the table schema). And for other errors (e.g. 5xx) I want
>> it to retry n times with exponential backoff.
>>
>> Is there any way to do this at the moment? Will I need to make some
>> custom changes to BigQueryIO?
>>
>> On Mon, Apr 10, 2017 at 7:11 PM, Josh <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I'm using BigQueryIO to write the output of an unbounded streaming job
>>> to BigQuery.
>>>
>>> In the case that an element in the stream cannot be written to BigQuery,
>>> BigQueryIO seems to have some default retry logic which retries the
>>> write a few times. However, if the write fails repeatedly, it seems to
>>> cause the whole pipeline to halt.
>>>
>>> How can I configure Beam so that if writing an element fails a few
>>> times, it simply gives up on writing that element and moves on without
>>> affecting the pipeline?
>>>
>>> Thanks for any advice,
>>> Josh
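[Editor's note: Lukasz's suggestion of fetching the schema upfront and prefiltering records in a preceding DoFn could look roughly like the sketch below. This is a hypothetical illustration only: the Beam-specific wiring (the DoFn and BigQueryIO.Write setup) is omitted, and `SchemaPrefilter`, `matchesSchema`, and the field-name-to-type map are assumed names, not part of the BigQueryIO API. In a real pipeline this check would run inside a DoFn placed just before the BigQueryIO write, with the schema fetched once from BigQuery.]

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the prefiltering idea: validate each record
// against a schema (field name -> expected Java type) fetched once from
// BigQuery, and drop records that don't conform instead of letting the
// write fail. Names here are illustrative, not Beam API.
public class SchemaPrefilter {

    // Returns true if every field in the record exists in the schema and
    // its value (when non-null) has the expected type; unknown fields or
    // type mismatches fail the check.
    public static boolean matchesSchema(Map<String, Object> record,
                                        Map<String, Class<?>> schema) {
        for (Map.Entry<String, Object> e : record.entrySet()) {
            Class<?> expected = schema.get(e.getKey());
            if (expected == null) {
                return false; // field not present in the table schema
            }
            if (e.getValue() != null && !expected.isInstance(e.getValue())) {
                return false; // value has the wrong type for this field
            }
        }
        return true;
    }
}
```

Inside the DoFn, records failing `matchesSchema` would be logged (or routed to a dead-letter output) rather than emitted downstream to the BigQuery write.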
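[Editor's note: the retry behavior Josh describes, retrying transient (e.g. 5xx) failures n times with exponential backoff while giving up on permanent ones, can be sketched generically as below. This is not BigQueryIO's actual retry logic; `BackoffRetry`, `TransientError`, and the parameter names are assumptions for illustration.]

```java
import java.util.concurrent.Callable;

// Illustrative "retry with exponential backoff" helper. A transient
// failure (e.g. a 5xx response) is retried with doubling delays; any
// other exception propagates immediately, so permanent errors (e.g. 4xx)
// are not retried. Assumed names, not part of Beam or BigQueryIO.
public class BackoffRetry {

    // Thrown by the task for errors worth retrying.
    public static class TransientError extends Exception {}

    // Runs the task, retrying up to maxAttempts total attempts with
    // delays of baseDelayMs, 2*baseDelayMs, 4*baseDelayMs, ...
    public static <T> T withBackoff(Callable<T> task, int maxAttempts,
                                    long baseDelayMs) throws Exception {
        long delay = baseDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (TransientError e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of retries: caller can log and skip
                }
                Thread.sleep(delay);
                delay *= 2; // exponential backoff
            }
        }
    }
}
```

With this shape, a wrapper around the write call could catch the final `TransientError` (or a 4xx immediately), log the element, and move on rather than halting the pipeline.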
