[ 
https://issues.apache.org/jira/browse/BEAM-5434?focusedWorklogId=175850&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-175850
 ]

ASF GitHub Bot logged work on BEAM-5434:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Dec/18 21:13
            Start Date: 16/Dec/18 21:13
    Worklog Time Spent: 10m 
      Work Description: stale[bot] commented on issue #6457: [BEAM-5434] 
Improve error handling in the artifact staging service
URL: https://github.com/apache/beam/pull/6457#issuecomment-447676392
 
 
   This pull request has been marked as stale due to 60 days of inactivity. It 
will be closed in 1 week if no further activity occurs. If you think that’s 
incorrect or this pull request still needs review, simply leave a comment. 
If closed, you can revive the PR at any time by @mentioning a reviewer 
or discussing it on the [email protected] list. Thank you for your 
contributions.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 175850)
    Time Spent: 1h  (was: 50m)

> Issue with BigQueryIO in Template
> ---------------------------------
>
>                 Key: BEAM-5434
>                 URL: https://issues.apache.org/jira/browse/BEAM-5434
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>    Affects Versions: 2.5.0
>            Reporter: Amarendra Kumar
>            Assignee: Chamikara Jayalath
>            Priority: Blocker
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> I am trying to build a Google Dataflow template to be run from a Cloud 
> Function.
> The issue is with BigQueryIO trying to execute a SQL query.
> The opening step for my Dataflow template is:
> {code:java}
> BigQueryIO.readTableRows()
>     .withQueryLocation("US")
>     .withoutValidation()
>     .fromQuery(options.getSql())
>     .usingStandardSql()
> {code}
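> For context, here is a minimal, self-contained sketch of how such a template 
> pipeline could be wired up. The class and interface names (NotificationTemplate, 
> NotificationOptions) and the options-factory boilerplate are made up for 
> illustration; the only parts taken from my actual code are the BigQueryIO chain 
> above and options.getSql(). In a template, the SQL has to be a ValueProvider so 
> it can be supplied at execution time rather than at template construction time:
> {code:java}
> import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
> import org.apache.beam.sdk.Pipeline;
> import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
> import org.apache.beam.sdk.options.Description;
> import org.apache.beam.sdk.options.PipelineOptionsFactory;
> import org.apache.beam.sdk.options.ValueProvider;
> 
> public class NotificationTemplate {
> 
>   // Hypothetical options interface; only getSql() appears in my real code.
>   public interface NotificationOptions extends DataflowPipelineOptions {
>     @Description("Standard SQL query to run against BigQuery")
>     ValueProvider<String> getSql();
>     void setSql(ValueProvider<String> value);
>   }
> 
>   public static void main(String[] args) {
>     NotificationOptions options = PipelineOptionsFactory.fromArgs(args)
>         .withValidation()
>         .as(NotificationOptions.class);
> 
>     Pipeline p = Pipeline.create(options);
>     p.apply("ReadFromBigQuery",
>         BigQueryIO.readTableRows()
>             .fromQuery(options.getSql()) // ValueProvider, resolved at run time
>             .usingStandardSql()
>             .withQueryLocation("US")
>             .withoutValidation());
>     p.run();
>   }
> }
> {code}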
> When the template is triggered for the first time, it runs fine.
> But when it is triggered a second time, it fails with the following error:
> {code}
> java.io.FileNotFoundException: No files matched spec: gs://test-notification/temp/Notification/BigQueryExtractTemp/34d42a122600416c9ea748a6e325f87a/000000000000.avro
>       at org.apache.beam.sdk.io.FileSystems.maybeAdjustEmptyMatchResult(FileSystems.java:172)
>       at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:158)
>       at org.apache.beam.sdk.io.FileBasedSource.createReader(FileBasedSource.java:329)
>       at com.google.cloud.dataflow.worker.WorkerCustomSources$1.iterator(WorkerCustomSources.java:360)
>       at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:177)
>       at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
>       at com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
>       at com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:391)
>       at com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:360)
>       at com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:288)
>       at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
>       at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
>       at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> {code}
> In the second run, why does the process expect a file at that GCS location?
> The file is created while the first job is running, but it is also deleted 
> once that job completes.
> How are the two jobs related?
> Could you please let me know whether I am missing something or this is a bug?
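> For what it is worth, BigQueryIO.TypedRead also exposes 
> withTemplateCompatibility(), whose Javadoc describes a source implementation 
> that is compatible with repeated template invocations. If the problem is that 
> the extract temp path gets fixed when the template graph is constructed, then 
> a sketch of the adjusted read step (everything except the extra call is from 
> my snippet above) would be:
> {code:java}
> BigQueryIO.readTableRows()
>     .fromQuery(options.getSql())
>     .usingStandardSql()
>     .withQueryLocation("US")
>     .withoutValidation()
>     // Re-derives the BigQuery export and its temp files on every template
>     // invocation instead of reusing the path captured at construction time.
>     .withTemplateCompatibility()
> {code}
> I have not verified this against 2.5.0, so please treat it as a guess rather 
> than a confirmed fix.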



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
