[ https://issues.apache.org/jira/browse/FLINK-22672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391525#comment-17391525 ]
Jin Xing commented on FLINK-22672:
----------------------------------
Resolving this umbrella for release 1.14. Unfinished tickets have been moved
to FLINK-23586 for follow-up tracking.
> Some enhancements for pluggable shuffle service framework
> ---------------------------------------------------------
>
> Key: FLINK-22672
> URL: https://issues.apache.org/jira/browse/FLINK-22672
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Network
> Reporter: Jin Xing
> Assignee: Jin Xing
> Priority: Major
> Fix For: 1.14.0
>
>
> "Pluggable shuffle service" in Flink provides an architecture that is
> unified for both streaming and batch jobs, allowing users to customize the
> process of data transfer between shuffle stages according to their scenarios.
> There are already a number of implementations of "remote shuffle service" on
> Spark, such as [1][2][3]. Remote shuffle makes it possible to shuffle data
> from/to a remote cluster and brings benefits such as:
> # The lifecycle of computing resources can be decoupled from shuffle data:
> once a computing task is finished, idle computing nodes can be released while
> their completed shuffle data is kept on the remote shuffle cluster.
> # There is no need to reserve disk capacity for shuffle on computing nodes.
> The remote shuffle cluster serves shuffle requests with better scalability
> and alleviates local disk pressure on computing nodes when data is skewed.
> Based on "pluggable shuffle service", we built our own "remote shuffle
> service" on Flink, called Lattice, which aims to provide functionality and
> improve performance for batch processing jobs. Basically, it works as follows:
> # The Lattice cluster works as an independent service that handles shuffle
> requests;
> # LatticeShuffleMaster extends ShuffleMaster, runs inside the JM, and talks to
> the remote Lattice cluster to request shuffle resources and manage the
> shuffle data lifecycle;
> # LatticeShuffleEnvironment extends ShuffleEnvironment, runs inside the TM, and
> provides an environment for shuffling data from/to the remote Lattice cluster.
> While building Lattice, we found some potential enhancements to
> "pluggable shuffle service". I will enumerate them and create sub-JIRAs under
> this umbrella.
>
> [1]
> [https://www.alibabacloud.com/blog/emr-remote-shuffle-service-a-powerful-elastic-tool-of-serverless-spark_597728]
> [2] [https://bestoreo.github.io/post/cosco/cosco/]
> [3] [https://github.com/uber/RemoteShuffleService]