chamikaramj commented on a change in pull request #16838:
URL: https://github.com/apache/beam/pull/16838#discussion_r805086385



##########
File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java
##########
@@ -985,6 +985,23 @@ public void deleteDataset(String projectId, String 
datasetId)
           // impossible to insert into BigQuery, and so we send it out through 
the dead-letter
           // queue.
           if (nextRowSize >= maxRowBatchSize) {
+            Boolean isRetryAlways = false;
+            try {
+              // We verify whether the retryPolicy parameter is "retryAlways". 
If it is, then
+              // it will return true. Otherwise it will return false, or it 
may throw an NPE,
+              // which we need to catch and ignore.
+              isRetryAlways = retryPolicy.shouldRetry(null);
+            } catch (NullPointerException e) {
+              // We do not need to do anything about this exception, as it was 
expected;
+            }
+            if (isRetryAlways) {

Review comment:
       Instead of this, how about just letting this request go through and 
letting BQ fail ? If it fails it will fail either way and letting BQ fail will 
make sure that there's not regression.

##########
File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java
##########
@@ -985,6 +985,23 @@ public void deleteDataset(String projectId, String 
datasetId)
           // impossible to insert into BigQuery, and so we send it out through 
the dead-letter
           // queue.
           if (nextRowSize >= maxRowBatchSize) {
+            Boolean isRetryAlways = false;
+            try {
+              // We verify whether the retryPolicy parameter is "retryAlways". 
If it is, then
+              // it will return true. Otherwise it will return false, or it 
may throw an NPE,
+              // which we need to catch and ignore.
+              isRetryAlways = retryPolicy.shouldRetry(null);
+            } catch (NullPointerException e) {
+              // We do not need to do anything about this exception, as it was 
expected;

Review comment:
       When would it throw an NPE ? If so I think you are probably not using 
the spec of that method correctly.

##########
File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java
##########
@@ -985,6 +985,23 @@ public void deleteDataset(String projectId, String 
datasetId)
           // impossible to insert into BigQuery, and so we send it out through 
the dead-letter
           // queue.
           if (nextRowSize >= maxRowBatchSize) {
+            Boolean isRetryAlways = false;
+            try {
+              // We verify whether the retryPolicy parameter is "retryAlways". 
If it is, then
+              // it will return true. Otherwise it will return false, or it 
may throw an NPE,
+              // which we need to catch and ignore.
+              isRetryAlways = retryPolicy.shouldRetry(null);
+            } catch (NullPointerException e) {
+              // We do not need to do anything about this exception, as it was 
expected;
+            }
+            if (isRetryAlways) {
+              throw new RuntimeException(
+                  String.format(
+                      "We have observed a row that is %s bytes in size. 
BigQuery supports"
+                          + " request sizes up to 10MB. You can set the 
pipeline option"
+                          + " maxStreamingBatchSize to a larger number to 
unblock this pipeline.",
+                      nextRowSize));
+            }
             errorContainer.add(

Review comment:
       Now about only actively failing for rows that are larger than 10MB (BQ 
RPC limit) which would surely fail when submitted ? For most users configured 
"maxRowBatchSize" would be much less than actual BQ limit.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to