[
https://issues.apache.org/jira/browse/BEAM-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045123#comment-17045123
]
Sushil Kumar commented on BEAM-8971:
------------------------------------
We are facing the same error in version `2.16.0`.
One thing I'm not sure about is whether the batch gets retried or discarded
when this exception occurs.
Since a timeout is a transient error and we are using
{code:java}
.withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors()){code}
I'm assuming the batch gets retried, which could cause duplicate events in
downstream applications.
{code:java}
Error message from worker: java.lang.RuntimeException: java.io.IOException:
Insert failed:
[{"errors":[{"debugInfo":"","location":"","message":"","reason":"timeout"}],"index":0}]
org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:151)
org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:112)
Caused by: java.io.IOException: Insert failed:
[{"errors":[{"debugInfo":"","location":"","message":"","reason":"timeout"}],"index":0}]
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:854)
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:871)
org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:140)
org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:112)
org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn$DoFnInvoker.invokeFinishBundle(Unknown Source)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.finishBundle(SimpleDoFnRunner.java:228)
org.apache.beam.runners.dataflow.worker.SimpleParDoFn.finishBundle(SimpleParDoFn.java:417)
org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.finish(ParDoOperation.java:56)
org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1316)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1049)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745){code}
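To make the duplicate concern concrete, here is a toy model in plain Java. It is not Beam or BigQuery code: `Row`, `FlakySink`, and `writeWithRetry` are hypothetical names, and the sink is assumed to store the rows and then report a timeout on its first call. As far as I understand, BigQuery streaming inserts do best-effort de-duplication on a per-row `insertId`, so whether a retried batch shows up twice would depend on the retry reusing the same IDs. The sketch only illustrates that idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model (not Beam or BigQuery code) of a retried batch insert. */
class RetryDedupSketch {

    /** A row plus an ID that stays the same across retries of the batch. */
    static final class Row {
        final String insertId;
        final String payload;

        Row(String insertId, String payload) {
            this.insertId = insertId;
            this.payload = payload;
        }
    }

    /** Hypothetical sink: its first call stores the rows but reports a timeout. */
    static final class FlakySink {
        final List<String> stored = new ArrayList<>();
        private final Set<String> seenIds = new HashSet<>();
        private final boolean dedupByInsertId;
        private boolean failNext = true;

        FlakySink(boolean dedupByInsertId) {
            this.dedupByInsertId = dedupByInsertId;
        }

        /** Returns false to simulate a timeout, even though the rows were stored. */
        boolean insertAll(List<Row> batch) {
            for (Row row : batch) {
                if (dedupByInsertId && !seenIds.add(row.insertId)) {
                    continue; // this ID was already stored by an earlier attempt
                }
                stored.add(row.payload);
            }
            if (failNext) {
                failNext = false;
                return false;
            }
            return true;
        }
    }

    /** Retries the whole batch until the sink reports success; returns attempts used. */
    static int writeWithRetry(FlakySink sink, List<Row> batch, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (sink.insertAll(batch)) {
                return attempt;
            }
        }
        throw new IllegalStateException("batch still failing after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        List<Row> batch = Arrays.asList(new Row("id-1", "a"), new Row("id-2", "b"));

        FlakySink plain = new FlakySink(false);
        writeWithRetry(plain, batch, 3);
        System.out.println("rows stored without de-duplication: " + plain.stored.size());

        FlakySink dedup = new FlakySink(true);
        writeWithRetry(dedup, batch, 3);
        System.out.println("rows stored with insert-ID de-duplication: " + dedup.stored.size());
    }
}
```

Under these assumptions, the sink without de-duplication ends up with four rows for a two-row batch, while the sink keyed on insert ID ends up with two.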
> BigQueryIO.Write sometimes throws errors
> -----------------------------------------
>
> Key: BEAM-8971
> URL: https://issues.apache.org/jira/browse/BEAM-8971
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Affects Versions: 2.15.0
> Reporter: Pavlo Pohrrebnyi
> Priority: Minor
>
> The following error happens from time to time. After that beam retries an
> entire batch and that gets processed fine. There are 2 concerns:
> * that may produce duplicates (however, I am not sure)
> * these might be false-positive errors which clutter the log and produce
> false alerts
> Stacktrace:
> java.lang.RuntimeException: java.io.IOException: Insert failed:
> [{"errors":[{"debugInfo":"","location":"","message":"","reason":"timeout"}],"index":0}]
> at
> org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:151)
> at
> org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:112)
> Caused by: java.io.IOException: Insert failed:
> [{"errors":[{"debugInfo":"","location":"","message":"","reason":"timeout"}],"index":0}]
> at
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:854)
> at
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:871)
> at
> org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:140)
> at
> org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:112)
> at
> org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn$DoFnInvoker.invokeFinishBundle(Unknown
> Source)
> at
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.finishBundle(SimpleDoFnRunner.java:224)
> at
> org.apache.beam.runners.dataflow.worker.SimpleParDoFn.finishBundle(SimpleParDoFn.java:412)
> at
> org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.finish(ParDoOperation.java:56)
> at
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85)
> at
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1295)
> at
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
> at
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1028)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)