[ 
https://issues.apache.org/jira/browse/BEAM-8312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-8312:
------------------------------
    Description: 
Currently, Flink job jars re-stage all artifacts at runtime (on the JobManager) 
by using the usual BeamFileSystemArtifactRetrievalService [1]. However, since 
the manifest and all the artifacts live on the classpath of the jar, and 
everything from the classpath is copied to the Flink workers anyway [2], it 
should not be necessary to do additional artifact staging. We could replace 
BeamFileSystemArtifactRetrievalService in this case with a simple 
ArtifactRetrievalService that just pulls the artifacts from the classpath.

 

 [1] 
[https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java]

[2] 
[https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93]

  was:
Currently, Flink job jars stage all artifacts by using the usual 
BeamFileSystemArtifactRetrievalService [1]. However, since the manifest and all 
the artifacts live on the classpath of the jar, and everything from the 
classpath is copied to the Flink workers anyway, it should not be necessary to 
do additional artifact staging. We could replace 
BeamFileSystemArtifactRetrievalService in this case with a simple 
ArtifactRetrievalService that just pulls the artifacts from the classpath.

 

 [1] 
[https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java]


> Flink portable pipeline jars do not need to stage artifacts remotely
> --------------------------------------------------------------------
>
>                 Key: BEAM-8312
>                 URL: https://issues.apache.org/jira/browse/BEAM-8312
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-flink
>            Reporter: Kyle Weaver
>            Assignee: Kyle Weaver
>            Priority: Major
>              Labels: portability-flink
>
> Currently, Flink job jars re-stage all artifacts at runtime (on the 
> JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. 
> However, since the manifest and all the artifacts live on the classpath of 
> the jar, and everything from the classpath is copied to the Flink workers 
> anyway [2], it should not be necessary to do additional artifact staging. We 
> could replace BeamFileSystemArtifactRetrievalService in this case with a 
> simple ArtifactRetrievalService that just pulls the artifacts from the 
> classpath.
>  
>  [1] 
> [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java]
> [2] 
> [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to