Thanks for the replies,
@Lukasz that sounds like a good option. It's just that it may be hard to
catch and filter out every case that would result in a 4xx error. I just
want to avoid the whole pipeline failing when a few elements in the
stream are bad.
@Dan that sounds promising, I will keep an eye on it.
I believe this is BEAM-190, which is actually being worked on today.
However, it will probably not be ready in time for the first stable release.
https://issues.apache.org/jira/browse/BEAM-190
On Tue, Apr 11, 2017 at 7:52 AM, Lukasz Cwik wrote:
Have you thought of fetching the schema upfront from BigQuery and
pre-filtering out any records in a preceding DoFn instead of relying on
BigQuery telling you that the schema doesn't match?
Otherwise you are correct in believing that you will need to update
BigQueryIO to have the retry/error semantics you want.
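For what it's worth, the pre-filtering step could look roughly like this. This is a plain Python sketch of the validation logic only, not Beam code; the schema, field names, and record shapes are made-up assumptions. In a pipeline, the DoFn analogue would emit valid rows to the main output and rejects to a side output for logging.

```python
# Illustrative schema: field name -> expected Python type (an assumption,
# not fetched from BigQuery here).
EXPECTED_SCHEMA = {"user_id": int, "event": str, "ts": float}

def matches_schema(record, schema=EXPECTED_SCHEMA):
    """True if the record has exactly the expected fields with matching types."""
    if set(record) != set(schema):
        return False
    return all(isinstance(record[name], typ) for name, typ in schema.items())

def prefilter(records, schema=EXPECTED_SCHEMA):
    """Split records into (valid, rejected) before they ever reach the sink."""
    valid, rejected = [], []
    for record in records:
        (valid if matches_schema(record, schema) else rejected).append(record)
    return valid, rejected
```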
What I really want to do is configure BigQueryIO to log an error and skip
the write if it receives a 4xx response from BigQuery (e.g. element does
not match table schema). And for other errors (e.g. 5xx) I want it to retry
n times with exponential backoff.
Is there any way to do this at the moment?
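To make the desired semantics concrete, here is an illustrative Python sketch (not BigQueryIO internals): skip and log on a 4xx-style error, retry with exponential backoff on a 5xx-style error. The exception classes and delay constants are assumptions standing in for HTTP responses.

```python
import time

BASE_DELAY_S = 0.5  # assumed starting delay for the backoff

class ClientError(Exception):
    """Stand-in for a 4xx response (e.g. row does not match the table schema)."""

class TransientError(Exception):
    """Stand-in for a 5xx response worth retrying."""

def write_with_retries(write_fn, row, max_retries=5, sleep=time.sleep):
    """Attempt to write one row.

    Returns True if written; False if the row was skipped (4xx) or all
    retries were exhausted (repeated 5xx). The pipeline keeps running
    either way instead of failing outright.
    """
    for attempt in range(max_retries):
        try:
            write_fn(row)
            return True
        except ClientError:
            # Permanent failure: log and drop the element.
            print(f"skipping bad row {row!r}")
            return False
        except TransientError:
            # Transient failure: back off exponentially, then retry.
            sleep(BASE_DELAY_S * (2 ** attempt))
    return False
```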
Hi,
I'm using BigQueryIO to write the output of an unbounded streaming job to
BigQuery.
In the case that an element in the stream cannot be written to BigQuery,
the BigQueryIO seems to have some default retry logic which retries the
write a few times. However, if the write fails repeatedly, it seems to
fail the whole pipeline.