[
https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665747#comment-16665747
]
Raghu Angadi edited comment on BEAM-5514 at 10/26/18 11:14 PM:
---------------------------------------------------------------
> 1. Handle quotaExceeded within the client with a backoff retry.
Agreed. Quota Exceeded should be treated the same as 'Rate Limited'. I think they
are logically the same thing w.r.t. BigQueryIO.
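Roughly, the classification could look like this (a minimal sketch assuming the
error details arrive as a GoogleJsonError; the helper name is mine, not existing
Beam API):
{code:java}
import com.google.api.client.googleapis.json.GoogleJsonError;

// Treat quotaExceeded exactly like rateLimitExceeded: both are transient
// overload signals that should trigger backoff instead of an immediate retry.
static boolean isRetryableOverload(GoogleJsonError error) {
  if (error == null || error.getErrors() == null || error.getErrors().isEmpty()) {
    return false;
  }
  String reason = error.getErrors().get(0).getReason();
  return "rateLimitExceeded".equals(reason) || "quotaExceeded".equals(reason);
}
{code}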
A note about worker retries: the worker does not do exponential backoff, but it
does wait 10 seconds before re-running a failed bundle ('work item' in Dataflow
terminology), which is already quite high.
The main issue is with the backoff mechanism itself. 'insertAll' in BigQueryIO
uses an unbounded thread pool to execute each insert from a separate thread, and
there could be thousands of inserts in a bundle. The backoff is calculated for
each insert independently, so we could have 1000 threads each backing off a bit,
which does not really help cut down the load.
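To illustrate the pattern (a simplified sketch, not the actual
BigQueryServicesImpl code; insertWithIndependentBackoff is a hypothetical
stand-in):
{code:java}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Every insert gets its own thread and its own independent backoff, so the
// aggregate request rate barely drops when BigQuery starts throttling.
static void insertAll(List<List<String>> batches) {
  ExecutorService pool = Executors.newCachedThreadPool(); // effectively unbounded
  for (List<String> batch : batches) {
    pool.submit(() -> insertWithIndependentBackoff(batch)); // hypothetical helper
  }
}
{code}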
Overall, we should control the aggregate rate (by reducing both the parallelism
and the frequency of retries within each thread). As such, I think we could use
a smaller pool for inserts, but I am not sure what the right size is. A simple
policy could be to multiply the retry time by the number of active inserts:
next_retry = backoff(num_retries) * num_active_inserts.
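A rough sketch of that policy (the base backoff, cap, and jitter are
illustrative placeholders, not a tested tuning):
{code:java}
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

static final long BASE_BACKOFF_MS = 1000;                    // placeholder base
static final AtomicInteger numActiveInserts = new AtomicInteger();

// next_retry = backoff(num_retries) * num_active_inserts
static long nextRetryMillis(int numRetries) {
  long backoff = BASE_BACKOFF_MS << Math.min(numRetries, 6); // capped exponential
  long jitter = ThreadLocalRandom.current().nextLong(backoff / 2 + 1);
  return (backoff + jitter) * Math.max(1, numActiveInserts.get());
}
{code}
Scaling by num_active_inserts makes each thread retry less often exactly when
many inserts are already in flight, which bounds the aggregate retry rate.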
> BigQueryIO doesn't handle quotaExceeded errors properly
> -------------------------------------------------------
>
> Key: BEAM-5514
> URL: https://issues.apache.org/jira/browse/BEAM-5514
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Reporter: Kevin Peterson
> Assignee: Chamikara Jayalath
> Priority: Major
>
> When exceeding a streaming quota for BigQuery insertAll requests, BigQuery
> returns a 403 with reason "quotaExceeded".
> The current implementation of BigQueryIO does not consider this to be a rate
> limited exception, and therefore does not perform exponential backoff
> properly, leading to repeated calls to BQ.
> The actual error is in the
> [ApiErrorExtractor|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263]
> class, which is called from
> [BigQueryServicesImpl|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739]
> to determine how to retry the failure.
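> For illustration, the retry decision in question looks roughly like this (a
> simplified sketch; errorExtractor, sleeper, and backOff stand in for the real
> fields, this is not the verbatim Beam code):
> {code:java}
> if (errorExtractor.rateLimited(exception)) {
>   sleeper.sleep(backOff.nextBackOffMillis()); // exponential backoff path
> } else {
>   throw exception; // a 403 with reason "quotaExceeded" currently ends up here
> }
> {code}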