Great, I'm very interested! This looks like it would also help in scenarios where shuffle write throughput drops after sortsplit. We encountered this in production: when a job's shuffle has multiple partitions that trigger sortsplit and revive, push data traffic becomes unbalanced across worker nodes, some hotspot workers get throttled, and the overall shuffle write throughput ultimately drops.
On June 29, 2025, at 22:32, rexxiong <rexxi...@apache.org> wrote:

Thanks to Erik for the proposal. In fact, before Erik introduced this feature to the community, we had already discussed the idea together, and Erik's team implemented it internally. Later, we integrated this optimization into our production environment, and I must say it has significantly improved performance in skew scenarios. It not only improves shuffle write efficiency notably but also improves cluster resource utilization by preventing overload on a few nodes.

Additionally, one small issue to note: CIP-18 is already taken, so you can use CIP-20 for this.

Regards,
Jiashu Xiong

On Friday, June 27, 2025, at 19:17, Erik fang <fme...@gmail.com> wrote:

Hi community,

I'd like to start a discussion about CIP-18: Dynamically optimize shuffle write parallelism.

This proposal aims to enable Celeborn to write to multiple PartitionLocations for a single partition concurrently, which significantly improves skew partition performance.

Please let me know if you have any comments or questions.

link: https://docs.google.com/document/d/1CqFswIOP5nR8Cy2THo8tELOwVxUf0ZP_8pDjaYj2HGc/edit?usp=sharing

Regards,
Erik