zuston opened a new pull request, #2669: URL: https://github.com/apache/uniffle/pull/2669
### What changes were proposed in this pull request?

This is part 1 of the change and contains only the Uniffle client changes. It moves the partition stats to the shuffle-server side to make the integrity validation mechanism more stable. The shuffle-server side changes will be implemented in follow-up PRs.

### Why are the changes needed?

With PR #2653, we can ensure data consistency end to end. However, the partition stats are stored on the Spark driver side. For normal Spark stages this design works well, but with 100000 tasks and 10000 partitions it overloads the Spark driver. In our cluster, some huge jobs hang when getting the blockManagerIds, which costs almost 20 minutes for a single reader task. That is unacceptable. Therefore, this PR makes the server side store the partition stats, as is already done for the block IDs.

### Does this PR introduce _any_ user-facing change?

`spark.rss.client.integrityValidation.serverManagementEnabled=false`

### How was this patch tested?

Internal job tests.
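
To give a rough sense of the scale cited in the PR description, here is a back-of-the-envelope sketch. It assumes the worst case where every map task reports stats for every partition, and the 16 bytes per entry is a purely hypothetical figure chosen for illustration; neither assumption is taken from the Uniffle code, and the function names are made up for this example.

```python
def driver_stat_entries(num_map_tasks: int, num_partitions: int) -> int:
    """Upper bound on (task, partition) stat records, assuming
    every map task reports a record for every partition."""
    return num_map_tasks * num_partitions


def estimated_bytes(entries: int, bytes_per_entry: int = 16) -> int:
    """Raw payload size for the given number of entries.
    bytes_per_entry is a hypothetical value, not a measured one."""
    return entries * bytes_per_entry


if __name__ == "__main__":
    # Figures from the PR description: 100000 tasks, 10000 partitions.
    entries = driver_stat_entries(100_000, 10_000)
    print(entries)  # 1000000000
    # Raw payload in GiB under the hypothetical 16 bytes per entry.
    print(round(estimated_bytes(entries) / 2**30, 1))
```

Under these assumptions the driver would have to track up to a billion records for a single stage, which suggests why spreading the stats across shuffle servers is attractive.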
