[
https://issues.apache.org/jira/browse/FLINK-13246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Flink Jira Bot updated FLINK-13246:
-----------------------------------
Labels: stale-assigned stale-minor (was: stale-minor)
> Implement external shuffle service for Kubernetes
> -------------------------------------------------
>
> Key: FLINK-13246
> URL: https://issues.apache.org/jira/browse/FLINK-13246
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Network
> Reporter: MalcolmSanders
> Assignee: MalcolmSanders
> Priority: Minor
> Labels: stale-assigned, stale-minor
>
> Flink batch job users could achieve better cluster utilization and job
> throughput through an external shuffle service, because the producers of
> intermediate result partitions can be released once those partitions
> have been persisted to disk. In
> [FLINK-10653|https://issues.apache.org/jira/browse/FLINK-10653], [~zjwang]
> has introduced a pluggable shuffle manager architecture, which abstracts the
> process of data transfer between stages from the Flink runtime as a shuffle
> service. I propose a k8s implementation of the Flink external shuffle service.
> There are a few points that need to be discussed:
> (1) How to deploy the external shuffle service in k8s?
> DaemonSet vs. sidecar mode
> (2) How to manage the PVs used for storing intermediate result partition data?
> Plan A: Shuffle servers (or other volume provisioners) provision PVs, and
> producers write to a local PV;
> Plan B: Producers write to the shuffle server over the network, and the
> shuffle server controls the use of PVs;
> (3) The shuffle server could temporarily use persistent storage backed by
> cloud storage such as AWSElasticBlockStore, cephFs, etc.
> I'll bring a design document later.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)