Abacn commented on issue #30646: URL: https://github.com/apache/beam/issues/30646#issuecomment-2080102737
After #31096, client-side throttling now works with the Storage Read API v2 stream (#28778) on the Dataflow legacy runner. There are still many caveats, however. For the default Read API v1 stream, it appears that an API call waiting on retry does not temporarily release the concurrent stream quota, so the [hasNext](https://github.com/apache/beam/blob/68f6b5515411852640f506716e84b2b97f977ef5/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageStreamSource.java#L242) call can be blocked for a very long time before the metrics get reported back to the work item thread. The pipeline does not upscale; it is stuck indefinitely (probably until retries are exhausted). It also appears to be ineffective on Dataflow runner v2.
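
For context, here is a minimal sketch of the client-side throttling reporting pattern used elsewhere in Beam (e.g. in DatastoreIO's write path), where time spent blocked on backoff/retry is accumulated into a counter named `throttling-msecs` so the runner can discount it when making autoscaling decisions. The class, wrapper method, and threshold below are illustrative only and are not the actual `BigQueryStorageStreamSource` code:

```java
import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.Metrics;

/**
 * Illustrative sketch (not Beam code): measure how long a blocking read such as
 * hasNext() stalls on retries and report that time as a throttling metric, so the
 * runner does not mistake the stall for useful work when autoscaling.
 */
class ThrottlingReportingReader {

  // Counter name follows the convention used by other Beam IOs for reporting
  // client-side throttling time to the runner.
  private final Counter throttlingMsecs =
      Metrics.counter(ThrottlingReportingReader.class, "throttling-msecs");

  /** Hypothetical wrapper around a blocking read such as hasNext(). */
  boolean readWithThrottlingReport(java.util.Iterator<?> stream) {
    long start = System.currentTimeMillis();
    try {
      return stream.hasNext(); // may block while the underlying API call retries
    } finally {
      long blockedMs = System.currentTimeMillis() - start;
      // Arbitrary threshold, purely for illustration: only report noticeable stalls.
      if (blockedMs > 1000) {
        throttlingMsecs.inc(blockedMs);
      }
    }
  }
}
```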
