[jira] [Updated] (FLINK-13246) Implement external shuffle service for Kubernetes

MalcolmSanders (JIRA) Fri, 12 Jul 2019 04:35:08 -0700


     [ https://issues.apache.org/jira/browse/FLINK-13246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


MalcolmSanders updated FLINK-13246:
-----------------------------------
    Description: 
Flink batch job users could achieve better cluster utilization and job 
throughput through an external shuffle service, because the producers of 
intermediate result partitions can be released once those partitions 
have been persisted to disk. In 
[FLINK-10653|https://issues.apache.org/jira/browse/FLINK-10653], [~zjwang] 
introduced a pluggable shuffle manager architecture that abstracts the process 
of data transfer between stages from the Flink runtime as a shuffle service. I 
propose a Kubernetes (k8s) implementation of the Flink external shuffle service.

There are a few points that need to be discussed:
(1) How to deploy the external shuffle service in k8s?
DaemonSet vs. Sidecar mode
(2) How to manage the PVs used for storing intermediate result partition data?
Plan A: Shuffle servers (or other volume provisioners) provision PVs, and 
producers write to a local PV;
Plan B: Producers write to the shuffle server over the network, and the shuffle 
server controls the use of PVs;
(3) The shuffle server could temporarily use persistent storage backed by cloud 
storage such as AWSElasticBlockStore, CephFS, etc.
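
To make the difference between Plan A and Plan B in point (2) concrete, here is a minimal toy sketch in Python. It is not Flink code and does not use any Flink or Kubernetes API: the ShuffleServer class, its methods, and the produce_plan_a / produce_plan_b helpers are hypothetical names invented for illustration, and a temporary directory stands in for a PV. It only shows who writes to the volume in each plan, and that consumers can read from the shuffle server after the producers have finished.

```python
import tempfile
from pathlib import Path


class ShuffleServer:
    """Toy stand-in for a per-node shuffle server (hypothetical, not a Flink API)."""

    def __init__(self, volume_dir):
        # Assumption: a plain directory plays the role of the PV.
        self.volume_dir = Path(volume_dir)
        self.partitions = {}  # partition id -> path of the persisted file

    def register(self, partition_id, path):
        # Plan A: the producer already wrote the file; the server only records it.
        self.partitions[partition_id] = Path(path)

    def write(self, partition_id, data):
        # Plan B: the server receives the bytes and decides where to store them.
        path = self.volume_dir / f"{partition_id}.data"
        path.write_bytes(data)
        self.partitions[partition_id] = path

    def read(self, partition_id):
        # Consumers fetch persisted partitions from the server, not the producer.
        return self.partitions[partition_id].read_bytes()


def produce_plan_a(server, partition_id, data):
    # Producer writes directly to the local volume, then registers the file.
    path = server.volume_dir / f"{partition_id}.data"
    path.write_bytes(data)
    server.register(partition_id, path)


def produce_plan_b(server, partition_id, data):
    # Producer hands the data to the server (a network call in a real system).
    server.write(partition_id, data)


def demo():
    with tempfile.TemporaryDirectory() as volume:
        server = ShuffleServer(volume)
        produce_plan_a(server, "p0", b"plan-a records")
        produce_plan_b(server, "p1", b"plan-b records")
        # Both producers have returned; the data is still readable.
        return server.read("p0"), server.read("p1")


if __name__ == "__main__":
    print(demo())
```

In this sketch Plan A requires the producer to have access to the volume, while Plan B keeps volume access inside the shuffle server at the cost of an extra data transfer. A real design would also need to cover volume lifecycle and cleanup, which the toy model leaves out.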

I'll bring a design document later.

  was:
Flink batch job users could achieve better cluster utilization and job 
throughput through an external shuffle service, because the producers of 
intermediate result partitions can be released once those partitions 
have been persisted to disk. In 
[FLINK-10653|https://issues.apache.org/jira/browse/FLINK-10653], [~zjwang] 
introduced a pluggable shuffle manager architecture that abstracts the process 
of data transfer between stages from the Flink runtime as a shuffle service. I 
propose a Kubernetes (k8s) implementation of the Flink external shuffle service.

There are a few points that need to be discussed:
(1) How to deploy the external shuffle service in k8s?
DaemonSet vs. Sidecar mode
(2) How to manage the PVs used for storing intermediate result partition data?
Plan A: Shuffle servers (or other volume provisioners) provision PVs, and 
producers write to a local PV;
Plan B: Producers write to the shuffle server over the network, and the shuffle 
server controls the use of PVs;

I'll bring a design document later.


> Implement external shuffle service for Kubernetes
> -------------------------------------------------
>
>                 Key: FLINK-13246
>                 URL: https://issues.apache.org/jira/browse/FLINK-13246
>             Project: Flink
>          Issue Type: New Feature
>          Components: Runtime / Network
>            Reporter: MalcolmSanders
>            Assignee: MalcolmSanders
>            Priority: Minor
>
> Flink batch job users could achieve better cluster utilization and job 
> throughput through an external shuffle service, because the producers of 
> intermediate result partitions can be released once those partitions 
> have been persisted to disk. In 
> [FLINK-10653|https://issues.apache.org/jira/browse/FLINK-10653], [~zjwang] 
> introduced a pluggable shuffle manager architecture that abstracts the 
> process of data transfer between stages from the Flink runtime as a shuffle 
> service. I propose a Kubernetes (k8s) implementation of the Flink external 
> shuffle service.
> There are a few points that need to be discussed:
> (1) How to deploy the external shuffle service in k8s?
> DaemonSet vs. Sidecar mode
> (2) How to manage the PVs used for storing intermediate result partition data?
> Plan A: Shuffle servers (or other volume provisioners) provision PVs, and 
> producers write to a local PV;
> Plan B: Producers write to the shuffle server over the network, and the shuffle 
> server controls the use of PVs;
> (3) The shuffle server could temporarily use persistent storage backed by cloud 
> storage such as AWSElasticBlockStore, CephFS, etc.
> I'll bring a design document later.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
