Abacn commented on code in PR #31125:
URL: https://github.com/apache/beam/pull/31125#discussion_r1581648996


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageStreamSource.java:
##########
@@ -388,8 +398,22 @@ public synchronized BigQueryStorageStreamSource<T> getCurrentSource() {
           // The following line is required to trigger the `FailedPreconditionException` on which
           // the SplitReadStream validation logic depends. Removing it will cause incorrect
           // split operations to succeed.
-          newResponseIterator.hasNext();
-          storageClient.reportPendingMetrics();
+          Future<Boolean> future = executor.submit(newResponseIterator::hasNext);
+          try {
+            // The intended wait time is in sync with splitReadStreamSettings.setRetrySettings in
+            // StorageClientImpl.
+            future.get(30, TimeUnit.SECONDS);

Review Comment:
   This corresponds to the config here:
   
   
https://github.com/apache/beam/blob/673da546c1465c931fdbbc5769e7d566ff55b4d8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L1735-L1742
   
   That config is intended to set the API call timeout during a split request to 
30s (and otherwise reject the split). However, the bug fix #21739 added this 
call here, which has its own internal retry logic and is not covered by the 
split-stream-retry config above. When there is a quota issue, this causes the 
pipeline to get stuck.
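   
   For readers unfamiliar with the pattern in the diff, here is a minimal, standalone sketch of bounding a blocking call with `Future.get(timeout, unit)`. The class `BoundedCallSketch` and the method `completesWithin` are hypothetical names used only for illustration; they are not part of Beam, and this is not the PR's actual code.
   
```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only; not part of Beam.
final class BoundedCallSketch {
  private BoundedCallSketch() {}

  /**
   * Runs a blocking call on the executor and waits at most the given time.
   * Returns true if the call finished in time, false if it timed out.
   * A failure thrown by the call itself surfaces as ExecutionException,
   * so the caller can still inspect the underlying cause.
   */
  static <T> boolean completesWithin(
      ExecutorService executor, Callable<T> call, long timeout, TimeUnit unit)
      throws InterruptedException, ExecutionException {
    Future<T> future = executor.submit(call);
    try {
      future.get(timeout, unit);
      return true;
    } catch (TimeoutException e) {
      // Attempt to interrupt the worker so a call stuck in retries does not linger.
      future.cancel(true);
      return false;
    }
  }
}
```
   
   A caller could treat a `false` result the same way as a rejected split. Whether the worker thread actually stops after `cancel(true)` depends on the blocked call responding to interruption.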


