[ 
https://issues.apache.org/jira/browse/BEAM-12088?focusedWorklogId=576358&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-576358
 ]

ASF GitHub Bot logged work on BEAM-12088:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Apr/21 23:57
            Start Date: 02/Apr/21 23:57
    Worklog Time Spent: 10m 
      Work Description: iemejia edited a comment on pull request #14417:
URL: https://github.com/apache/beam/pull/14417#issuecomment-812751458


   Thanks @ibzib I will merge this as it is in order to test it with tomorrow's 
SNAPSHOTs, We should definitely re evaluate the file staging logic in a 
subsequent PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 576358)
    Time Spent: 1h 20m  (was: 1h 10m)

> Make file staging uniform among Spark Runners
> ---------------------------------------------
>
>                 Key: BEAM-12088
>                 URL: https://issues.apache.org/jira/browse/BEAM-12088
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Ismaël Mejía
>            Assignee: Ismaël Mejía
>            Priority: P2
>             Fix For: 2.30.0
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Both the Spark Classic and Portable runners share the file staging logic, but 
> the Structured Streaming runner is using a different logic even if the 
> process should in principle be the same. This manifests on issues when trying 
> to deploy pipelines via Hadoop YARN with exceptions related to file Staging 
> of non existent files.
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: 
> /tmp/beamTempLocation/86b9809c0719d64ac39af2beb29c73ff71cb8264add4f1bc01f27fefddf6f5c1.jar
>  (No such file or directory)
>         at 
> org.apache.beam.runners.core.construction.resources.PipelineResources.zipDirectory(PipelineResources.java:120)
>         at 
> org.apache.beam.runners.core.construction.resources.PipelineResources.packageDirectoriesToStage(PipelineResources.java:94)
>         at 
> org.apache.beam.runners.core.construction.resources.PipelineResources.lambda$prepareFilesForStaging$1(PipelineResources.java:85)
>         at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>         at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>         at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
>         at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>         at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>         at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>         at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>         at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
>         at 
> org.apache.beam.runners.core.construction.resources.PipelineResources.prepareFilesForStaging(PipelineResources.java:88)
>         at 
> org.apache.beam.runners.spark.structuredstreaming.translation.PipelineTranslator.prepareFilesToStageForRemoteClusterExecution(PipelineTranslator.java:61)
>         at 
> org.apache.beam.runners.spark.structuredstreaming.SparkStructuredStreamingRunner.translatePipeline(SparkStructuredStreamingRunner.java:199)
>         at 
> org.apache.beam.runners.spark.structuredstreaming.SparkStructuredStreamingRunner.run(SparkStructuredStreamingRunner.java:153)
>         at 
> org.apache.beam.runners.spark.structuredstreaming.SparkStructuredStreamingRunner.run(SparkStructuredStreamingRunner.java:70)
>         at org.apache.beam.sdk.Pipeline.run(Pipeline.java:322)
>         at org.apache.beam.sdk.Pipeline.run(Pipeline.java:308)
>         at 
> com.talend.tpcds.beam.Query3ViaBeamSdkGenericRecord.main(Query3ViaBeamSdkGenericRecord.java:240)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:685)
> Caused by: java.io.FileNotFoundException: 
> /tmp/beamTempLocation/86b9809c0719d64ac39af2beb29c73ff71cb8264add4f1bc01f27fefddf6f5c1.jar
>  (No such file or directory)
>         at java.io.FileOutputStream.open0(Native Method)
>         at java.io.FileOutputStream.open(FileOutputStream.java:270)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:101)
>         at 
> org.apache.beam.runners.core.construction.resources.PipelineResources.zipDirectory(PipelineResources.java:118)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to