damccorm commented on issue #20211: URL: https://github.com/apache/beam/issues/20211#issuecomment-2120530976
I actually think the current behavior is correct/reasonable. When we hit a record that can't be inserted (for this reason or others), it is reasonable for BigQueryIO to fail the work item, which is what we do in both batch and streaming mode. We could skip the local retries here, which would slightly improve the experience, though it wouldn't help much; it would just fail slightly faster.

At that point, the behavior of the full pipeline depends on the runner and on whether the pipeline is running in batch or streaming mode. Most runners fail the pipeline after a few retries in batch mode and retry continuously in streaming mode. This is because the expectation for a streaming pipeline is that it's more feasible to update the pipeline (or, in this case, create the table) than to relaunch a new pipeline. For most runners, a streaming pipeline will never stop retrying, and that is intentional, regardless of the error. So I don't think we should make a change here.
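The batch-versus-streaming distinction described above can be sketched as a toy model. This is only an illustration of the retry semantics the comment describes, not Beam's or any runner's actual code; the function name `process_work_item` and the parameter `max_batch_retries` are hypothetical:

```python
def process_work_item(attempt_insert, mode, max_batch_retries=4):
    """Toy model of how a typical runner retries a failing work item.

    attempt_insert: callable that raises on failure (e.g. the target
        table does not exist yet).
    mode: "batch" or "streaming".
    Returns the number of attempts made on success; in batch mode,
    re-raises the failure once the retry budget is exhausted.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            attempt_insert()
            return attempts
        except Exception:
            if mode == "batch" and attempts >= max_batch_retries:
                # Batch: give up after a few retries and fail the pipeline.
                raise
            # Streaming: keep retrying indefinitely, on the assumption
            # that it is easier to fix the pipeline in place (e.g. create
            # the missing table) than to relaunch it.
```

In this model, a failure that an operator later fixes (such as creating the missing table) eventually lets a streaming pipeline make progress, while a batch pipeline with the same failure surfaces it quickly and terminates.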
