[
https://issues.apache.org/jira/browse/FLINK-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Flink Jira Bot closed FLINK-11805.
----------------------------------
Resolution: Auto Closed
This issue was labeled "stale-minor" 7 ago and has not received any updates so
I have gone ahead and closed it. If you are still affected by this or would
like to raise the priority of this ticket please re-open, removing the label
"auto-closed" and raise the ticket priority accordingly.
> A Common External Shuffle Service Framework
> -------------------------------------------
>
> Key: FLINK-11805
> URL: https://issues.apache.org/jira/browse/FLINK-11805
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Network
> Reporter: MalcolmSanders
> Assignee: MalcolmSanders
> Priority: Minor
> Labels: auto-closed, stale-assigned
>
> In [FLINK-10653|https://issues.apache.org/jira/browse/FLINK-10653], [~zjwang]
> has introduced pluggable shuffle manager architecture which abstracts the
> process of data transfer between stages from flink runtime as shuffle
> service. Here I'd like to propose a common external shuffle service framework
> so that a majority of external shuffle services could be achieved more easily
> by compositing and wrapping this framework as well as implementing a few
> interfaces according to the specific platform or deployment system.
> As far as I'm concerned, a common external shuffle service scenario:
> (1) a shuffle service daemon process runs on each host machine as a server to
> provide shuffle data for remote(maybe local) consumers.
> (2) a producer gets a local persistent output directory for writing shuffle
> data from the shuffle service daemon process of current host machine, and
> writes shuffle data afterwards.
> (3) a consumer fetch its subpartition data from the shuffle service daemon on
> the host machine where the partition locates.
> In my point of view, such framework could be applicable to external shuffle
> services such as YarnShuffleService, KubernetesShuffleService and
> StandaloneShuffleService. As to KubernetesShuffleService, there is also
> another plan, named as sidecar mode, to achieve shuffle service on k8s which
> puts a TM process and a shuffle service process into a pod. As a result,
> there might be multiple shuffle service daemons running on a host machine, it
> can still fit into the framework since the only difference might be whether
> the port of each shuffle service process is fixed or not. Accroding to
> [~zjwang]'s
> [proposal|https://cwiki.apache.org/confluence/display/FLINK/FLIP-31%3A+Pluggable+Shuffle+Manager],
> this case can be handled via UpdatePartitionInfo so that the actual port of
> each shuffle service process can be updated to the consumers.
> This framework is not intended to handle external shuffle services which use
> global storages as the media for shuffle data, such as DfsShuffleService, or
> other implementations which don't request an actual shuffle service role such
> as RdmaShuffleService.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)