[
https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650805#comment-16650805
]
Kevin Peterson commented on BEAM-5514:
--------------------------------------
From the docs on that page, under the streaming inserts section, it specifies
the failure as "quotaExceeded" for rate limit failures.
In this particular case, I see both "rateLimitExceeded" errors and
"quotaExceeded" errors in the logs. I'm not 100% clear on when BQ sends which
one, but my impression is that "rateLimitExceeded" is more short term, while
"quotaExceeded" is longer term.
The rate-limit errors are retried within the client using backoff, while the
others are retried via the worker retry mechanism (a streaming pipeline, so
forever). The issue is that the worker mechanism doesn't do exponential
backoff, so the retries happen much too quickly. In my observations, once you
start getting "quotaExceeded" errors, you pretty much continue to get them
instead of "rateLimitExceeded" errors, so the client never re-enters the
backoff part of the loop and just fails and retries right away.
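To illustrate the behavior described above, here is a minimal sketch (the class and method names are hypothetical, not Beam's actual code): only "rateLimitExceeded" is treated as retryable by the client, so "quotaExceeded" falls through to the worker retry, which re-runs immediately with no backoff.

```java
// Hypothetical illustration of the current behavior, not Beam's real classes.
public class CurrentRetryBehavior {
    // The client-side backoff loop only recognizes "rateLimitExceeded" as a
    // rate-limit error; "quotaExceeded" is not matched, so it escapes the
    // loop and is retried by the worker without any backoff.
    public static boolean clientBacksOff(String reason) {
        return "rateLimitExceeded".equals(reason);
    }
}
```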
I'd say the current implementation is probably correct from an HTTP
error-codes standpoint, but it has the side effect of retrying errors much too
quickly, which is not great for the BQ frontend. There are three solutions I
can think of:
1. Handle quotaExceeded within the client with a backoff retry.
2. Retry worker failures in Dataflow with a backoff (maybe only for sinks?).
3. Do nothing, and rely on users to notice the errors and stop the pipeline
until quotas are increased.
I'd advocate for #1.
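A minimal sketch of what #1 could look like (hypothetical names, not a proposed patch): treat both 403 reasons as retryable in the client and compute a capped exponential backoff before re-sending the insertAll request.

```java
import java.util.Set;

// Hypothetical sketch of option #1: handle "quotaExceeded" in the client
// with the same exponential-backoff retry used for "rateLimitExceeded".
public class QuotaBackoffSketch {
    // Both reasons indicate throttling by the BigQuery frontend.
    private static final Set<String> RETRYABLE_REASONS =
        Set.of("rateLimitExceeded", "quotaExceeded");

    public static boolean isRetryableReason(String reason) {
        return RETRYABLE_REASONS.contains(reason);
    }

    // Exponential backoff: baseMillis * 2^attempt, capped at maxMillis.
    public static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis << Math.min(attempt, 30); // cap shift to avoid overflow
        return Math.min(delay, maxMillis);
    }
}
```

In a real implementation, the delay would also get random jitter so that many workers hitting the quota don't retry in lockstep.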
> BigQueryIO doesn't handle quotaExceeded errors properly
> -------------------------------------------------------
>
> Key: BEAM-5514
> URL: https://issues.apache.org/jira/browse/BEAM-5514
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Reporter: Kevin Peterson
> Assignee: Reuven Lax
> Priority: Major
>
> When exceeding a streaming quota for BigQuery insertAll requests, BigQuery
> returns a 403 with reason "quotaExceeded".
> The current implementation of BigQueryIO does not consider this to be a rate
> limited exception, and therefore does not perform exponential backoff
> properly, leading to repeated calls to BQ.
> The actual error is in the
> [ApiErrorExtractor|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263]
> class, which is called from
> [BigQueryServicesImpl|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739]
> to determine how to retry the failure.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)