[
https://issues.apache.org/jira/browse/BEAM-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459219#comment-16459219
]
Thomas Weise commented on BEAM-4063:
------------------------------------
Flink 1.5 is already frozen, so the above PR isn't going to make it into that
release. Since most environments already depend on an external storage system
(HDFS, S3, etc.), shouldn't the default implementation use the distributed
cache for now, while also allowing users to replace it with their own service?
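
To make the suggestion concrete, here is a minimal sketch of what a replaceable
artifact hook could look like. It assumes artifacts are staged under a shared
base URI that every node can reach, which is what the distributed cache relies
on today. All names (ArtifactLocationResolver, SharedStorageResolver,
ArtifactResolvers) are hypothetical and are not Beam or Flink APIs; the sketch
only resolves locations and does not call the Flink Distributed Cache itself.

```java
import java.net.URI;
import java.util.Optional;

/** Hypothetical hook: maps an artifact name to a location every node can read. */
interface ArtifactLocationResolver {
    URI resolve(String artifactName);
}

/** Assumed default: artifacts live under a shared base URI (e.g. HDFS or S3). */
final class SharedStorageResolver implements ArtifactLocationResolver {
    private final URI base;

    SharedStorageResolver(URI base) {
        this.base = base;
    }

    @Override
    public URI resolve(String artifactName) {
        // Relative resolution against the base; the base is expected to end with '/'.
        return base.resolve(artifactName);
    }
}

/** Picks the user-supplied resolver if present, otherwise the shared-storage default. */
final class ArtifactResolvers {
    private ArtifactResolvers() {}

    static ArtifactLocationResolver choose(
            Optional<ArtifactLocationResolver> userSupplied, URI sharedBase) {
        return userSupplied.orElseGet(() -> new SharedStorageResolver(sharedBase));
    }
}
```

With no user-supplied resolver, choose(Optional.empty(), base) falls back to
shared storage; passing Optional.of(myResolver) swaps in a custom service
without touching the default path.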
> Flink runner supports cluster-wide artifact deployments through the
> Distributed Cache
> -------------------------------------------------------------------------------------
>
> Key: BEAM-4063
> URL: https://issues.apache.org/jira/browse/BEAM-4063
> Project: Beam
> Issue Type: New Feature
> Components: runner-flink
> Reporter: Ben Sidhom
> Priority: Minor
>
> As of now, Flink effectively has a dependency on an external storage system
> for artifact management. This is because the Flink Distributed Cache does not
> actually distribute and cache blobs itself, but rather expects that each node
> in a running cluster has access to a well-known artifact resource.
> We should get real blob distribution for free once
> [https://github.com/apache/flink/pull/5580] is merged (likely in 1.5). Until
> then, we will have to defer to external storage systems like GCS or HDFS.