[ https://issues.apache.org/jira/browse/BEAM-4286?focusedWorklogId=102277&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-102277 ]
ASF GitHub Bot logged work on BEAM-4286:
----------------------------------------
Author: ASF GitHub Bot
Created on: 15/May/18 20:44
Start Date: 15/May/18 20:44
Worklog Time Spent: 10m
Work Description: bsidhom commented on a change in pull request #5359:
[BEAM-4286] Implement pooled artifact source
URL: https://github.com/apache/beam/pull/5359#discussion_r188379997
##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/ArtifactSourcePool.java
##########
@@ -39,23 +44,134 @@
@ThreadSafe
public class ArtifactSourcePool implements ArtifactSource {
+ public static ArtifactSourcePool create() {
Review comment:
That's correct. However, we can still use the DistributedCache for local
runs (although it's not strictly necessary since the standard FS is available
to all workers). I think it's also important to have such a utility because
Flink _will_ eventually support real distributed caches.
This will hopefully happen sooner than we thought;
https://github.com/apache/flink/pull/5580 has already been merged.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 102277)
Time Spent: 1h 10m (was: 1h)
> Pooled artifact source
> ----------------------
>
> Key: BEAM-4286
> URL: https://issues.apache.org/jira/browse/BEAM-4286
> Project: Beam
> Issue Type: Bug
> Components: runner-flink
> Reporter: Ben Sidhom
> Assignee: Ben Sidhom
> Priority: Minor
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Because DistributedCache lifetimes are tied to operator lifetimes in Flink, we
> need a way to wrap operator-scoped artifact sources. Artifacts are inherently
> job-scoped and should be the same throughout a job's lifetime. For this
> reason, it is safe to pool artifact sources and serve artifacts from an
> arbitrary pooled source as long as the underlying source is still in scope.
> We need a pooled source in order to satisfy the bundle factory interfaces.
> Using the job-scoped and stage-scoped bundle factories allows us to cache and
> reuse different components that serve SDK harnesses. Because the distributed
> cache lifetimes are specific to Flink, the pooled artifact source should
> probably live in a runner-specific directory.
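
The pooling idea described above can be sketched roughly as follows. This is a minimal illustration only, not the code from pull request #5359: SimpleArtifactSource, Registration, PooledArtifactSource, and addToPool are hypothetical names invented for this sketch, and the real ArtifactSource interface in Beam has a different shape. The sketch assumes every pooled source serves identical job-scoped artifacts, so any source still in scope may answer a request.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for an artifact source; not Beam's ArtifactSource interface. */
interface SimpleArtifactSource {
  byte[] getArtifact(String name);
}

/** Hypothetical handle that removes an operator-scoped source from the pool when closed. */
interface Registration extends AutoCloseable {
  @Override
  void close();
}

/** Serves artifacts from any registered source that is still in scope. */
final class PooledArtifactSource implements SimpleArtifactSource {
  private final Object lock = new Object();
  private final Set<SimpleArtifactSource> sources = new LinkedHashSet<>();

  /** Registers an operator-scoped source; close the handle when the operator shuts down. */
  public Registration addToPool(SimpleArtifactSource source) {
    synchronized (lock) {
      if (!sources.add(source)) {
        throw new IllegalStateException("source is already in the pool");
      }
    }
    return () -> {
      // Blocks while a read is in flight, so a source never goes out of scope mid-read.
      synchronized (lock) {
        sources.remove(source);
      }
    };
  }

  @Override
  public byte[] getArtifact(String name) {
    // Holding the lock for the whole read keeps the chosen source registered until it returns.
    synchronized (lock) {
      if (sources.isEmpty()) {
        throw new IllegalStateException("no artifact source is currently in scope");
      }
      // Artifacts are job-scoped, so an arbitrary pooled source is acceptable.
      return sources.iterator().next().getArtifact(name);
    }
  }
}
```

A single coarse lock keeps the sketch short at the cost of serializing reads; a production version would likely want finer-grained locking so that concurrent reads do not block one another.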
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)