Amarendra Kumar created BEAM-5434:
-------------------------------------
Summary: Issue with BigQueryIO in Template
Key: BEAM-5434
URL: https://issues.apache.org/jira/browse/BEAM-5434
Project: Beam
Issue Type: Bug
Components: sdk-java-core
Affects Versions: 2.5.0
Reporter: Amarendra Kumar
Assignee: Kenneth Knowles
I am trying to build a Google Dataflow template to be run from a Cloud Function.
The issue occurs when BigQueryIO tries to execute a SQL query.
The opening step of my Dataflow template is:
{code:java}
BigQueryIO.readTableRows()
    .withQueryLocation("US")
    .withoutValidation()
    .fromQuery(options.getSql())
    .usingStandardSql()
{code}
When the template is triggered for the first time, it runs fine.
But when it is triggered a second time, it fails with the following error:
{code}
java.io.FileNotFoundException: No files matched spec: gs://temp-test-notification/temp/MoEngageNotification/BigQueryExtractTemp/34d42a122600416c9ea748a6e325f87a/000000000000.avro
    at org.apache.beam.sdk.io.FileSystems.maybeAdjustEmptyMatchResult(FileSystems.java:172)
    at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:158)
    at org.apache.beam.sdk.io.FileBasedSource.createReader(FileBasedSource.java:329)
    at com.google.cloud.dataflow.worker.WorkerCustomSources$1.iterator(WorkerCustomSources.java:360)
    at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:177)
    at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
    at com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
    at com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:391)
    at com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:360)
    at com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:288)
    at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
    at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
    at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
Could you please let me know whether I am missing something or whether this is a bug?