Great, very interested! This looks like it would also be useful for scenarios
where shuffle write throughput drops after sortsplit. We encountered this in
production: when a job's shuffle with multiple partitions triggers sortsplit
and revives, the push data traffic becomes unbalanced across worker nodes and
some hotspot workers get throttled, ultimately reducing the overall shuffle
write throughput.

On June 29, 2025, at 22:32, rexxiong <rexxi...@apache.org> wrote:

Thanks to Erik for the proposal.

In fact, before Erik introduced this feature to the community, we had
already discussed this idea together, and Erik's team implemented it
internally. Later, we integrated this optimization into our production
environment, and I must say it has significantly improved performance in
skew scenarios. It not only enhances shuffle write efficiency notably but
also improves cluster resource utilization, preventing overload on a few
nodes.

Additionally, there's a small issue to note: CIP-18 has already been used,
so you can use CIP-20 for this.


Regards,
Jiashu Xiong


Erik fang <fme...@gmail.com> wrote on Friday, June 27, 2025, at 19:17:

Hi community,

I'd like to start a discussion about CIP-18: Dynamically optimize shuffle
write parallelism.

This proposal aims to enable Celeborn to write to multiple
PartitionLocations for a single partition concurrently, which significantly
improves performance for skewed partitions.

Please let me know if you have any comments or questions.

link:

https://docs.google.com/document/d/1CqFswIOP5nR8Cy2THo8tELOwVxUf0ZP_8pDjaYj2HGc/edit?usp=sharing

Regards,
Erik

