[
https://issues.apache.org/jira/browse/BEAM-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pablo Estrada reassigned BEAM-6831:
-----------------------------------
Assignee: Pablo Estrada
> python sdk WriteToBigQuery excessive usage of metered API
> ---------------------------------------------------------
>
> Key: BEAM-6831
> URL: https://issues.apache.org/jira/browse/BEAM-6831
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Affects Versions: 2.10.0
> Reporter: Pesach Weinstock
> Assignee: Pablo Estrada
> Priority: Major
> Labels: bigquery, dataflow, gcp, python
> Attachments: apache-beam-py-sdk-gcp-bq-api-issue.png
>
>
> There appears to be an issue in the Python SDK where
> {{beam.io.gcp.bigquery.WriteToBigQuery}} calls the following API more often
> than needed:
> https://www.googleapis.com/bigquery/v2/projects/<project-name>/datasets/<dataset-name>/tables/<table-name>?alt=json
> This request counts against BigQuery API request quotas, which are separate
> from the quotas for streaming inserts. In a streaming pipeline we hit this
> quota quickly, after which we cannot write any further data to BigQuery.
> Dispositions being used are:
> * create_disposition: {{beam.io.BigQueryDisposition.CREATE_NEVER}}
> * write_disposition: {{beam.io.BigQueryDisposition.WRITE_APPEND}}
> This currently blocks us from using BigQueryIO in a streaming pipeline to
> write to BigQuery, and we had to formally request an API quota increase from
> Google as a temporary workaround.
> Our pipeline uses DataflowRunner. The error is shown below and in the
> attached screenshot of the Stackdriver trace.
> {code:java}
> "errors": [
>   {
>     "message": "Exceeded rate limits: too many api requests per user per method for this user_method. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors",
>     "domain": "usageLimits",
>     "reason": "rateLimitExceeded"
>   }
> ],
> {code}
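
For illustration only, here is a minimal sketch of one way repeated table
lookups could be avoided: memoize the lookup per (project, dataset, table) so
the metered request is issued at most once per table per process. This is not
taken from the Beam code base and is not necessarily how the issue will be
fixed. TableMetadataCache, fetch_table and fake_tables_get are hypothetical
names; fake_tables_get stands in for the real tables.get request so the
example runs without GCP credentials.

```python
import threading
from typing import Callable, Dict, Tuple

TableKey = Tuple[str, str, str]  # (project, dataset, table)


class TableMetadataCache:
    """Memoizes table lookups so each table is fetched at most once per process."""

    def __init__(self, fetch_table: Callable[[str, str, str], dict]):
        # fetch_table is a hypothetical stand-in for the metered tables.get call.
        self._fetch_table = fetch_table
        self._cache: Dict[TableKey, dict] = {}
        self._lock = threading.Lock()

    def get(self, project: str, dataset: str, table: str) -> dict:
        key = (project, dataset, table)
        # The lock is held during the fetch, so concurrent callers asking for
        # the same table trigger one request instead of several.
        with self._lock:
            if key not in self._cache:
                self._cache[key] = self._fetch_table(project, dataset, table)
            return self._cache[key]


# Hypothetical fake that records every "API request" it receives.
calls = []


def fake_tables_get(project: str, dataset: str, table: str) -> dict:
    calls.append((project, dataset, table))
    return {
        "tableReference": {
            "projectId": project,
            "datasetId": dataset,
            "tableId": table,
        }
    }


cache = TableMetadataCache(fake_tables_get)

# Simulate many bundles writing to the same table.
for _ in range(1000):
    cache.get("my-project", "my_dataset", "events")

print(len(calls))  # number of simulated metered requests
```

A cache like this trades freshness for fewer requests: a schema change made
after the first lookup would not be seen until the worker restarts, which
seems acceptable with CREATE_NEVER and WRITE_APPEND but may not be in general.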