[jira] [Commented] (FLINK-27862) FLIP-235: Hybrid Shuffle Mode
[ https://issues.apache.org/jira/browse/FLINK-27862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553406#comment-17553406 ]

Aitozi commented on FLINK-27862:

Hi [~xtsong],

I had an offline discussion with [~Weijie Guo], and I will try to start parallel work from ticket 3: Introduce HsDataBuffer. Could you help assign the ticket to me?

> FLIP-235: Hybrid Shuffle Mode
>
> Key: FLINK-27862
> URL: https://issues.apache.org/jira/browse/FLINK-27862
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Network
> Reporter: Weijie Guo
> Assignee: Weijie Guo
> Priority: Major
> Labels: Umbrella
>
> Introduce a new shuffle mode that can overcome some of the problems of Pipelined
> Shuffle and Blocking Shuffle in batch scenarios. It can make the best use of
> available resources and minimize disk IO load.
> For more details, see
> [FLIP-235|https://cwiki.apache.org/confluence/display/FLINK/FLIP-235%3A+Hybrid+Shuffle+Mode]

--
This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (FLINK-27862) FLIP-235: Hybrid Shuffle Mode
[ https://issues.apache.org/jira/browse/FLINK-27862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550832#comment-17550832 ]

Aitozi commented on FLINK-27862:

Thanks [~Weijie Guo] and [~xtsong] for your kind guidance. I will take a look at the PoC work first and then reach out to you for further discussion.
[jira] [Commented] (FLINK-27862) FLIP-235: Hybrid Shuffle Mode
[ https://issues.apache.org/jira/browse/FLINK-27862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550822#comment-17550822 ]

Xintong Song commented on FLINK-27862:

Hi [~aitozi],

Thanks for offering. I see several ways that you may help.
* Reviewing the PRs would definitely be appreciated.
* You may also help with transforming the PoC implementation into PRs, which involves some design changes w.r.t. the FLIP as well as improving the code quality and adding test cases.

For the second part, if you want, you may first take a look at the PoC code, and we can set up a call to discuss how the workload can be split. I believe there are some tasks that can be worked on in parallel.
[jira] [Commented] (FLINK-27862) FLIP-235: Hybrid Shuffle Mode
[ https://issues.apache.org/jira/browse/FLINK-27862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550819#comment-17550819 ]

Weijie Guo commented on FLINK-27862:

Hi [~aitozi],

Thank you very much for your interest; you are welcome to participate. Let me share the current status of this FLIP:

1. We already have a PoC version in-house with some level of testing.
2. The implementation of this PoC version is not exactly the same as the design in FLIP-235. For example, its spill strategy writes all data to disk instead of using the selective spill strategy.
3. To verify whether there are conflicts when merging the code into open source Flink, I pushed the code to a branch in my own [GitHub repository|https://github.com/reswqa/flink/tree/hs-merge-from-vvr]. Since part of the code is going to be discarded in the new design, that part was not picked into the test branch, so this branch cannot actually run. It does, however, already contain the core implementation of our PoC version.
4. If you have any other questions, you are very welcome to communicate with me offline.
[jira] [Commented] (FLINK-27862) FLIP-235: Hybrid Shuffle Mode
[ https://issues.apache.org/jira/browse/FLINK-27862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550507#comment-17550507 ]

Aitozi commented on FLINK-27862:

Hi [~Weijie Guo],

Thanks for starting this work. I'm interested in this FLIP. Can I join this effort and take on some simple tasks to start?