HyukjinKwon opened a new pull request, #38613: URL: https://github.com/apache/spark/pull/38613
### What changes were proposed in this pull request?

This PR is a followup of https://github.com/apache/spark/pull/38468. It removes the notify-wait approach and introduces a new way to collect partitions in parallel and send them in order.

- Previously, the implementation waited until all results were stored and then sent them one by one as Protobuf messages, so notify-wait was not actually needed. In both the worst and best cases, all partitions were collected first and then sent partition by partition.
- Now, it sends Protobuf messages in order as soon as the 0th partition is available, and sends each subsequent partition once it is available. In the worst case, all partitions are collected and then sent one by one. In the best case, each partition is sent as it is collected.

### Why are the changes needed?

For better performance, lower memory usage, and better readability and maintainability (by removing synchronization).

### Does this PR introduce _any_ user-facing change?

No. This feature is not released yet, and this is a performance-only fix.

### How was this patch tested?

CI in this PR should test it out.
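
To illustrate the ordered-send behavior described under "What changes were proposed", here is a minimal Python sketch of the general idea, not the code in this PR. It assumes results arrive through a callback invoked from a single thread, which is why no locking or notify-wait is needed. `OrderedSender`, `on_result`, and `collect_and_send` are hypothetical names introduced only for this example, and `send` stands in for emitting one Protobuf message.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Generic, Sequence, TypeVar

T = TypeVar("T")


class OrderedSender(Generic[T]):
    """Buffers out-of-order partition results and emits them in index order.

    Hypothetical helper: assumes on_result is always called from one thread,
    so the buffer needs no synchronization.
    """

    def __init__(self, num_partitions: int, send: Callable[[int, T], None]):
        self._num_partitions = num_partitions
        self._send = send
        self._pending: Dict[int, T] = {}  # results that arrived early
        self._next = 0  # index of the next partition to send

    def on_result(self, index: int, result: T) -> None:
        self._pending[index] = result
        # Flush the contiguous run starting at the next expected index.
        # Each result is dropped from the buffer as soon as it is sent.
        while self._next in self._pending:
            self._send(self._next, self._pending.pop(self._next))
            self._next += 1

    @property
    def done(self) -> bool:
        return self._next == self._num_partitions


def collect_and_send(
    tasks: Sequence[Callable[[], T]],
    send: Callable[[int, T], None],
    max_workers: int = 4,
) -> None:
    """Runs tasks in parallel and sends their results in task order."""
    sender = OrderedSender(len(tasks), send)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(task): i for i, task in enumerate(tasks)}
        # Completed futures are handled on this thread only, in completion
        # order, which mimics a single-threaded result handler.
        for future in as_completed(futures):
            sender.on_result(futures[future], future.result())
    assert sender.done
```

In this sketch the worst case is partition 0 finishing last, so every other partition is buffered before anything is sent. The best case is partitions finishing in index order, so each one is sent immediately and the buffer never holds more than one result.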
