Min Shen created SPARK-36423:
--------------------------------
Summary: Randomize blocks within a push request before pushing to
improve block merge ratio
Key: SPARK-36423
URL: https://issues.apache.org/jira/browse/SPARK-36423
Project: Spark
Issue Type: Sub-task
Components: Shuffle, Spark Core
Affects Versions: 3.2.0
Reporter: Min Shen
On the client side, we are currently randomizing the order of push requests
before processing each request. In addition, we can further randomize the order
of blocks within each push request before pushing them.
In our benchmark, this has resulted in a 60%-70% reduction of blocks that fail
to be merged due to block collision (the existing block merge ratio is already
pretty good in general, and this further improves it).
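
The two levels of randomization described above can be illustrated with a
minimal sketch. This is not Spark's actual client code: `PushRequest`,
`randomize_push_order`, and the block ID strings are hypothetical names chosen
for illustration only. The sketch shuffles the order of the requests, then
shuffles the blocks inside each request, so that concurrent pushers are less
likely to send blocks for the same partition at the same moment.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PushRequest:
    """Hypothetical stand-in for a push request: a target host and its blocks."""
    host: str
    block_ids: List[str]


def randomize_push_order(
    requests: List[PushRequest], seed: Optional[int] = None
) -> List[PushRequest]:
    """Return the requests in random order, each with its blocks shuffled.

    The inputs are left unmodified; new PushRequest objects are returned.
    """
    rng = random.Random(seed)

    # Level 1 (already done today, per the issue): shuffle the requests.
    shuffled_requests = list(requests)
    rng.shuffle(shuffled_requests)

    # Level 2 (the proposal): shuffle the blocks within each request.
    return [
        PushRequest(req.host, rng.sample(req.block_ids, len(req.block_ids)))
        for req in shuffled_requests
    ]


if __name__ == "__main__":
    # Illustrative block IDs only; the naming is an assumption.
    demo = [
        PushRequest("host-a", ["block_0_0", "block_0_1", "block_0_2"]),
        PushRequest("host-b", ["block_1_0", "block_1_1"]),
    ]
    for req in randomize_push_order(demo):
        print(req.host, req.block_ids)
```

Each request still carries exactly the same set of blocks to the same host;
only the order changes, which is why the change is limited to the client side.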
--
This message was sent by Atlassian Jira
(v8.3.4#803005)