[
https://issues.apache.org/jira/browse/BEAM-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545562#comment-17545562
]
Beam JIRA Bot commented on BEAM-14364:
--------------------------------------
This issue is assigned but has not received an update in 30 days so it has been
labeled "stale-assigned". If you are still working on the issue, please give an
update and remove the label. If you are no longer working on the issue, please
unassign so someone else may work on it. In 7 days the issue will be
automatically unassigned.
> 404s in BigQueryIO don't get output to Failed Inserts PCollection
> -----------------------------------------------------------------
>
> Key: BEAM-14364
> URL: https://issues.apache.org/jira/browse/BEAM-14364
> Project: Beam
> Issue Type: Bug
> Components: io-py-gcp
> Reporter: Svetak Vihaan Sundhar
> Assignee: Svetak Vihaan Sundhar
> Priority: P1
> Labels: stale-assigned
> Attachments: ErrorsInPrototypeJob.PNG
>
>
> Given that BigQueryIO is configured with createDisposition(CREATE_NEVER),
> the DynamicDestinations class returns "null" for a schema,
> and the table for that destination does not exist in BigQuery: when I stream
> records to BigQuery for that table, the write should fail and the failed
> rows should appear on the output PCollection for failed inserts (via
> getFailedInserts()).
>
> Almost all of the time the table exists beforehand, but since new tables can
> be created, we want this behavior to be non-fatal to the job. What we are
> seeing instead is that processing completely stops in those pipelines, and
> eventually the jobs run out of memory. I feel the appropriate action when
> BigQuery returns a 404 for the table is to emit those failed TableRows to
> the output PCollection and continue processing as normal.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)