eolivelli commented on PR #3604:
URL: https://github.com/apache/celeborn/pull/3604#issuecomment-3925763630
> @eolivelli, thanks for the improvement to `S3MultipartUploadHandler`. The changes generally look good. Could you provide a performance test report for this improvement?

I added some flamegraphs to the PR description. I don't have numbers at hand, but we are talking about reducing the time from minutes to seconds. In particular, we reduce the number of calls to S3: AssumeRole and ListObject. In AWS S3 you pay for those calls, and Celeborn usually creates lots of slots for each job (the default number of shuffle partitions in Spark is 200).

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
