gauravkm commented on PR #3062: URL: https://github.com/apache/celeborn/pull/3062#issuecomment-2818678151
@gaoyajun02 Thanks for reviewing. * I hope you don't mind watching https://issues.apache.org/jira/browse/CELEBORN-894 for the latest updates. Ethan and Jiashu have been interacting with me on the JIRA and the design doc * After [v2 updates](https://docs.google.com/document/d/1YqK0kua-5rMufJw57kEIrHHGbLnAF9iXM5GdDweMzzg/edit?tab=t.0) to the design doc, there was feedback highlighting gaps related to the new implementation of skewed partition handling in Celeborn and how the current design won't work in that scenario * Last week was spent updating the design to [v3](https://docs.google.com/document/d/1YqK0kua-5rMufJw57kEIrHHGbLnAF9iXM5GdDweMzzg/edit?tab=t.0#heading=h.vw2m85p95521), that can handle the new implementation of skew partition handling as well. However, it does involve some tradeoffs. * The v3 design simplifies v2 quite a bit and so to minimize churn in code changes, I decided to first have design alignment before I updated the new patch. Could you please review as well? On the whole, it consolidates the interactions and I expect fewer code changes than the current patch. * Regarding split PR's for review purposes, I do plan to split it into multiple PRs. The new patch is 2000 lines of code and so it makes a lot of sense to split As next steps: 1. Let's align on the design. Could you please review [v3 design](https://docs.google.com/document/d/1YqK0kua-5rMufJw57kEIrHHGbLnAF9iXM5GdDweMzzg/edit?tab=t.0#heading=h.vw2m85p95521) proposal as well? 2. In parallel this week, I am working on updating the current implementation with the v3 changes. I want to test the updated implementation with both skew and non-skew scenarios to confirm that it works as expected in all scenarios 3. I can publish a more detailed split PR proposal as well later this week. As of now, the rough pieces are Driver changes, Shuffle Writer changes, Reader changes. I plan to hide the new code paths behind a flag. Thanks for your continued patience and engaging on this important Celeborn feature! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
