Tim Chiang created BEAM-13962:
---------------------------------
Summary: Using GCP Dataflow Pub/Sub to BigQuery
Key: BEAM-13962
URL: https://issues.apache.org/jira/browse/BEAM-13962
Project: Beam
Issue Type: Bug
Components: io-java-gcp
Affects Versions: 2.34.0
Environment: GCP, Dataflow
Reporter: Tim Chiang
Here is the error message I get when I run a streaming Pub/Sub to BigQuery pipeline on Dataflow:
{code:java}
java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(org/apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImpl.java:1001)
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(org/apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImpl.java:1054)
    at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite.flushRows(org/apache.beam.sdk.io.gcp.bigquery/BatchedStreamingWrite.java:421)
    at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite.access$900(org/apache.beam.sdk.io.gcp.bigquery/BatchedStreamingWrite.java:72)
    at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite$BatchAndInsertElements.finishBundle(org/apache.beam.sdk.io.gcp.bigquery/BatchedStreamingWrite.java:267)
Caused by: java.lang.NullPointerException
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.getErrorInfo(BigQueryServicesImpl.java:1075)
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.lambda$insertAll$1(BigQueryServicesImpl.java:922)
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$BoundedExecutorService$SemaphoreCallable.call(BigQueryServicesImpl.java:1594)
    at java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.lang.Thread.run(Thread.java:834)
{code}
And here is another one.
{code:java}
Error message from worker: java.lang.RuntimeException: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
POST https://bigquery.googleapis.com/bigquery/v2/projects/{project-id}/datasets/{dataset-id}/tables/{table-id}/insertAll?prettyPrint=false
{
"code" : 400,
"errors" : [ {
"domain" : "global",
"message" : "No rows present in the request.",
"reason" : "invalid"
} ],
"message" : "No rows present in the request.",
"status" : "INVALID_ARGUMENT"
}
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:1001)
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:1054)
org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite.flushRows(BatchedStreamingWrite.java:421)
org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite.access$900(BatchedStreamingWrite.java:72)
org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite$BatchAndInsertElements.finishBundle(BatchedStreamingWrite.java:267)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
POST https://bigquery.googleapis.com/bigquery/v2/projects/{project-id}/datasets/{dataset-id}/tables/{table-id}/insertAll?prettyPrint=false
{
"code" : 400,
"errors" : [ {
"domain" : "global",
"message" : "No rows present in the request.",
"reason" : "invalid"
} ],
"message" : "No rows present in the request.",
"status" : "INVALID_ARGUMENT"
}
com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:428)
com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514)
com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455)
com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565)
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.lambda$insertAll$1(BigQueryServicesImpl.java:910)
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$BoundedExecutorService$SemaphoreCallable.call(BigQueryServicesImpl.java:1594)
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base/java.lang.Thread.run(Thread.java:834)
{code}
I'm wondering whether these errors will affect my Dataflow job. What could
happen as a result? I searched for a whole day but couldn't find any
similar problem. Thank you all.
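
To illustrate the second error, here is a minimal sketch of the kind of guard that would avoid it, assuming the 400 response is triggered by an insertAll request whose row list is empty, as the "No rows present in the request." message suggests. The class EmptyBatchGuard, the method flush, and the sender callback are hypothetical names used only for illustration; this is not Beam's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class EmptyBatchGuard {
    // Hypothetical helper, not part of Beam: forwards a batch to the sender
    // only when it contains at least one row.
    // Returns true if the batch was sent, false if it was skipped.
    public static <T> boolean flush(List<T> rows, Consumer<List<T>> sender) {
        if (rows == null || rows.isEmpty()) {
            return false; // nothing to insert, so no request is made
        }
        sender.accept(rows);
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for the real insert call: records every batch it receives.
        List<List<String>> sent = new ArrayList<>();

        System.out.println(flush(new ArrayList<String>(), sent::add)); // false
        System.out.println(flush(List.of("row-1"), sent::add));        // true
        System.out.println(sent.size());                               // 1
    }
}
```

With a guard like this, an empty batch never reaches the insert call, so the "No rows present in the request." response could not occur on that path.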