[
https://issues.apache.org/jira/browse/BEAM-12472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sam Whittle updated BEAM-12472:
-------------------------------
Fix Version/s: 2.34.0
Resolution: Fixed
Status: Resolved (was: Open)
> BigQuery streaming writes can be batched beyond request limit with
> BatchAndInsertElements
> -----------------------------------------------------------------------------------------
>
> Key: BEAM-12472
> URL: https://issues.apache.org/jira/browse/BEAM-12472
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Reporter: Sam Whittle
> Assignee: Pablo Estrada
> Priority: P2
> Labels: stale-P2
> Fix For: 2.34.0
>
>
> BatchAndInsertElements accumulates all input elements and flushes them in
> finishBundle.
> However, if a bundle contains enough data, the BigQuery request size limit can
> be exceeded, causing an exception like the following. It seems that
> finishBundle should limit the number of rows and bytes per request and, if
> necessary, flush multiple times for a destination.
> A workaround is to use autosharding, which uses state that has batching
> limits, or to increase the number of streaming keys to decrease the likelihood
> of hitting this.
> {code}
> Error while processing a work item: UNKNOWN: org.apache.beam.sdk.util.UserCodeException: java.lang.RuntimeException: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
> POST https://bigquery.googleapis.com/bigquery/v2/projects/google.com:clouddfe/datasets/nexmark_06090820455271/tables/nexmark_simple/insertAll?prettyPrint=false
> {
>   "code" : 400,
>   "errors" : [ {
>     "domain" : "global",
>     "message" : "Request payload size exceeds the limit: 10485760 bytes.",
>     "reason" : "badRequest"
>   } ],
>   "message" : "Request payload size exceeds the limit: 10485760 bytes.",
>   "status" : "INVALID_ARGUMENT"
> }
>   at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:39)
>   at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite$BatchAndInsertElements$DoFnInvoker.invokeFinishBundle(Unknown Source)
>   at org.apache.beam.fn.harness.FnApiDoFnRunner.finishBundle(FnApiDoFnRunner.java:1661)
> {code}
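
The suggestion in the description (limit rows and bytes, and flush more than once per destination) can be illustrated with a minimal sketch. This is not the change that was committed for this issue. The class name BoundedBatcher, the method split, and the use of UTF-8 string length as a stand-in for serialized row size are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Beam: splits buffered rows into several
// requests so that no single request exceeds a row-count or byte-size limit.
final class BoundedBatcher {

    static List<List<String>> split(List<String> rows, int maxRows, long maxBytes) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentBytes = 0;

        for (String row : rows) {
            // Assumption: the UTF-8 length approximates the serialized row size.
            long rowBytes = row.getBytes(StandardCharsets.UTF_8).length;

            // Close the current batch before adding a row that would push it
            // past either limit.
            if (!current.isEmpty()
                    && (current.size() >= maxRows || currentBytes + rowBytes > maxBytes)) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }

            // A single row larger than maxBytes still ends up alone in its own
            // batch; this sketch does not reject or split such rows.
            current.add(row);
            currentBytes += rowBytes;
        }

        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

A finishBundle-style caller would then issue one insert request per returned batch for each destination, instead of one request containing everything that was buffered.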