I believe this is BEAM-190, which is actually being worked on today.
However, it will probably not be ready in time for the first stable release.

https://issues.apache.org/jira/browse/BEAM-190

On Tue, Apr 11, 2017 at 7:52 AM, Lukasz Cwik <[email protected]> wrote:

> Have you thought of fetching the schema upfront from BigQuery and
> prefiltering out any records in a preceding DoFn instead of relying on
> BigQuery telling you that the schema doesn't match?
>
> Otherwise you are correct in believing that you will need to update
> BigQueryIO to have the retry/error semantics that you want.
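[A minimal sketch of the prefiltering idea in plain Java. It assumes the set of allowed field names has already been fetched once from BigQuery (e.g. via a Tables.get call) before the pipeline starts; `SchemaPrefilter` and `matchesSchema` are hypothetical names, and the check would be invoked from a DoFn's @ProcessElement ahead of the BigQueryIO write.]

```java
import java.util.Map;
import java.util.Set;

public class SchemaPrefilter {
    // Field names the destination table accepts, fetched once from
    // BigQuery before the pipeline is constructed.
    private final Set<String> allowedFields;

    public SchemaPrefilter(Set<String> allowedFields) {
        this.allowedFields = allowedFields;
    }

    // Returns true if every field in the row exists in the table schema.
    // Rows failing this check would be logged and dropped (or routed to
    // a side output) instead of being sent to BigQuery at all, so the
    // sink never sees a schema-mismatch error for them.
    public boolean matchesSchema(Map<String, Object> row) {
        return allowedFields.containsAll(row.keySet());
    }
}
```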
>
> On Tue, Apr 11, 2017 at 1:12 AM, Josh <[email protected]> wrote:
>
>> What I really want to do is configure BigQueryIO to log an error and skip
>> the write if it receives a 4xx response from BigQuery (e.g. element does
>> not match table schema). And for other errors (e.g. 5xx) I want it to retry
>> n times with exponential backoff.
>>
>> Is there any way to do this at the moment? Will I need to make some
>> custom changes to BigQueryIO?
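[A hedged sketch, in plain Java, of the semantics described above: skip on 4xx, retry with exponential backoff on 5xx. The names `Action`, `decide`, and `backoffMillis` are made up for illustration and are not part of BigQueryIO; the real behavior would have to be wired into BigQueryIO's insert path.]

```java
public class RetryPolicySketch {
    public enum Action { SKIP, RETRY, FAIL }

    // 4xx: client error (e.g. element does not match the table schema)
    //      -- log and skip the write for that element.
    // 5xx: transient server error -- retry up to maxAttempts times.
    public static Action decide(int httpStatus, int attempt, int maxAttempts) {
        if (httpStatus >= 400 && httpStatus < 500) {
            return Action.SKIP;
        }
        if (httpStatus >= 500 && attempt < maxAttempts) {
            return Action.RETRY;
        }
        return Action.FAIL;
    }

    // Exponential backoff: wait baseMillis * 2^attempt before retry n.
    public static long backoffMillis(long baseMillis, int attempt) {
        return baseMillis << attempt;
    }
}
```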
>>
>>
>>
>> On Mon, Apr 10, 2017 at 7:11 PM, Josh <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I'm using BigQueryIO to write the output of an unbounded streaming job
>>> to BigQuery.
>>>
>>> In the case that an element in the stream cannot be written to BigQuery,
>>> the BigQueryIO seems to have some default retry logic which retries the
>>> write a few times. However, if the write fails repeatedly, it seems to
>>> cause the whole pipeline to halt.
>>>
>>> How can I configure beam so that if writing an element fails a few
>>> times, it simply gives up on writing that element and moves on without
>>> affecting the pipeline?
>>>
>>> Thanks for any advice,
>>> Josh
>>>
>>
>>
>
