damccorm commented on issue #20211: URL: https://github.com/apache/beam/issues/20211#issuecomment-2120530976
I actually think the current behavior is correct/reasonable. When we hit a record that can't be inserted (for this reason or others), it is reasonable for BigQueryIO to fail the work item, which is what we do in both batch and streaming mode. We could skip the local retries here, which would slightly improve the experience, though it wouldn't help much; it would just fail slightly faster.

At that point, the behavior of the full pipeline depends on the runner and on whether the pipeline is running in batch or streaming mode. Most runners fail the pipeline after a few retries in batch mode and retry continuously in streaming mode. This is because the expectation for a streaming pipeline is that it's more feasible to update the pipeline (or, in this case, create the table) than to relaunch a new pipeline. For most runners, a streaming pipeline will never stop retrying, and that is intentional, regardless of the error. So I don't think we should make a change here.
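The batch-versus-streaming distinction described above can be sketched as a toy model. This is only an illustration of the retry semantics the comment describes, not Beam's or any runner's actual code; the function name `process_work_item` and the parameter `max_batch_retries` are hypothetical:

```python
def process_work_item(attempt_insert, mode, max_batch_retries=4):
    """Toy model of how a typical runner retries a failing work item.

    attempt_insert: callable that raises on failure (e.g. the target
        table does not exist yet).
    mode: "batch" or "streaming".
    Returns the number of attempts made on success; in batch mode,
    re-raises the failure once the retry budget is exhausted.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            attempt_insert()
            return attempts
        except Exception:
            if mode == "batch" and attempts >= max_batch_retries:
                # Batch: give up after a few retries and fail the pipeline.
                raise
            # Streaming: keep retrying indefinitely, on the assumption
            # that it is easier to fix the pipeline in place (e.g. create
            # the missing table) than to relaunch it.
```

In this model, a failure that an operator later fixes (such as creating the missing table) eventually lets a streaming pipeline make progress, while a batch pipeline with the same failure surfaces it quickly and terminates.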
