[ https://issues.apache.org/jira/browse/BEAM-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550152#comment-17550152 ]
Danny McCormick commented on BEAM-14364:
----------------------------------------

This issue has been migrated to https://github.com/apache/beam/issues/21713

> 404s in BigQueryIO don't get output to Failed Inserts PCollection
> -----------------------------------------------------------------
>
>                 Key: BEAM-14364
>                 URL: https://issues.apache.org/jira/browse/BEAM-14364
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-gcp
>            Reporter: Svetak Vihaan Sundhar
>            Assignee: Svetak Vihaan Sundhar
>            Priority: P1
>              Labels: stale-assigned
>         Attachments: ErrorsInPrototypeJob.PNG
>
>
> Given that BigQueryIO is configured to use createDisposition(CREATE_NEVER),
> and the DynamicDestinations class returns "null" for a schema,
> and the table for that destination does not exist in BigQuery, when I stream
> records to BigQuery for that table, then the write should fail,
> and the failed rows should appear on the output PCollection for Failed
> Inserts (via getFailedInserts()).
>
> Almost all of the time, the table exists beforehand, but because new
> tables can be created, we want this behavior to be non-explosive to the job.
> However, what we are seeing is that processing completely stops in those
> pipelines, and eventually the jobs run out of memory. I feel that the
> appropriate action when BigQuery returns a 404 for the table would be to
> submit those failed TableRows to the output PCollection and continue
> processing as normal.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
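
The behavior requested in the issue description can be modeled in a few lines of plain Python. This is only a sketch of the desired semantics, not Beam code: `FakeBigQuery`, `TableNotFound`, and `write_with_failed_inserts` are hypothetical names that exist in neither Beam nor the BigQuery client, and an in-memory dictionary stands in for the real service.

```python
from dataclasses import dataclass, field


class TableNotFound(Exception):
    """Stands in for a BigQuery 404 response (hypothetical, not a Beam class)."""


@dataclass
class FakeBigQuery:
    """In-memory stand-in for BigQuery; only the tables listed here exist."""
    tables: dict = field(default_factory=dict)

    def insert(self, table, rows):
        # A missing table is reported the way a 404 would be.
        if table not in self.tables:
            raise TableNotFound(table)
        self.tables[table].extend(rows)


def write_with_failed_inserts(client, batches):
    """Write each (table, rows) batch and return the rows that failed.

    Assumption: with a CREATE_NEVER-style policy the table is never created
    on the fly, so a missing table is treated as a permanent failure and the
    rows go to the failed-inserts output instead of being retried forever.
    """
    failed = []
    for table, rows in batches:
        try:
            client.insert(table, rows)
        except TableNotFound:
            failed.extend((table, row) for row in rows)
    return failed


client = FakeBigQuery(tables={"events": []})
batches = [
    ("events", [{"id": 1}]),
    ("missing", [{"id": 2}, {"id": 3}]),  # this table does not exist
    ("events", [{"id": 4}]),
]
failed = write_with_failed_inserts(client, batches)
```

In this model the rows for the missing table end up in `failed`, and the later batch for `events` is still written, which is the "continue processing as normal" outcome the reporter asks for.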