Have you thought of fetching the schema upfront from BigQuery and
filtering out any mismatched records in a preceding DoFn, instead of
relying on BigQuery to tell you that the schema doesn't match?
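
Something along these lines, say. This is only a minimal sketch: it
assumes your elements are TableRows bound for a single known table,
FilterToSchema is an illustrative name rather than anything in Beam,
the schema fetch uses the google-cloud-bigquery client, and only field
names are checked (a real version would validate types and REQUIRED
fields as well):

import com.google.api.services.bigquery.model.TableRow;
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Field;
import com.google.cloud.bigquery.Schema;
import java.util.HashSet;
import java.util.Set;
import org.apache.beam.sdk.transforms.DoFn;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Fetches the target table's schema once per DoFn instance and drops
// any row containing a field name the schema doesn't know about, so
// BigQueryIO never sees a schema-mismatched row.
class FilterToSchema extends DoFn<TableRow, TableRow> {
  private static final Logger LOG =
      LoggerFactory.getLogger(FilterToSchema.class);

  private final String dataset;
  private final String table;
  private transient Set<String> fieldNames;

  FilterToSchema(String dataset, String table) {
    this.dataset = dataset;
    this.table = table;
  }

  @Setup
  public void setup() {
    // One schema fetch per DoFn instance, not per element.
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    Schema schema =
        bigquery.getTable(dataset, table).getDefinition().getSchema();
    fieldNames = new HashSet<>();
    for (Field field : schema.getFields()) {
      fieldNames.add(field.getName());
    }
  }

  @ProcessElement
  public void processElement(ProcessContext c) {
    TableRow row = c.element();
    if (fieldNames.containsAll(row.keySet())) {
      c.output(row);
    } else {
      LOG.error("Dropping row that does not match table schema: {}", row);
    }
  }
}

You would then apply ParDo.of(new FilterToSchema(...)) immediately
before your BigQueryIO write. One caveat: since the schema is fetched
once per instance, workers won't notice a schema change made while the
job is running.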

Otherwise you are correct in believing that you will need to update
BigQueryIO to have the retry/error semantics that you want.
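
If you do go down that road, the retry semantics you describe below
would amount to something like the following around the insert call.
Again, this is only a sketch of the shape of the logic, not BigQueryIO's
actual code path: the real sink batches rows and reports per-row insert
errors rather than throwing, and the status-code handling here assumes
the failure surfaces as a GoogleJsonResponseException.

import com.google.api.client.googleapis.json.GoogleJsonResponseException;
import java.util.concurrent.Callable;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Gives up immediately on 4xx responses (retrying a schema mismatch
// will never succeed) and retries 5xx responses up to maxAttempts
// times with exponential backoff.
class RetryPolicySketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(RetryPolicySketch.class);

  static <T> T callWithRetries(Callable<T> call, int maxAttempts)
      throws Exception {
    long backoffMillis = 1000;
    for (int attempt = 1; ; attempt++) {
      try {
        return call.call();
      } catch (GoogleJsonResponseException e) {
        int code = e.getStatusCode();
        if (code < 500) {
          // 4xx: log and skip this element instead of failing the bundle.
          LOG.error("Skipping element after non-retryable response {}: {}",
              code, e.getDetails());
          return null;
        }
        if (attempt >= maxAttempts) {
          throw e; // Out of retries for 5xx errors.
        }
        Thread.sleep(backoffMillis);
        backoffMillis *= 2; // Exponential backoff before the next attempt.
      }
    }
  }
}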

On Tue, Apr 11, 2017 at 1:12 AM, Josh <[email protected]> wrote:

> What I really want to do is configure BigQueryIO to log an error and skip
> the write if it receives a 4xx response from BigQuery (e.g. element does
> not match table schema). And for other errors (e.g. 5xx) I want it to retry
> n times with exponential backoff.
>
> Is there any way to do this at the moment? Will I need to make some custom
> changes to BigQueryIO?
>
>
>
> On Mon, Apr 10, 2017 at 7:11 PM, Josh <[email protected]> wrote:
>
>> Hi,
>>
>> I'm using BigQueryIO to write the output of an unbounded streaming job to
>> BigQuery.
>>
>> When an element in the stream cannot be written to BigQuery,
>> BigQueryIO seems to have some default retry logic which retries the
>> write a few times. However, if the write fails repeatedly, it seems to
>> halt the whole pipeline.
>>
>> How can I configure Beam so that if writing an element fails a few times,
>> it simply gives up on writing that element and moves on without affecting
>> the pipeline?
>>
>> Thanks for any advice,
>> Josh
>>
>
>
