Yuval Degani created SPARK-22229:
------------------------------------
Summary: SPIP: RDMA Accelerated Shuffle Engine
Key: SPARK-22229
URL: https://issues.apache.org/jira/browse/SPARK-22229
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 2.3.0
Reporter: Yuval Degani
An RDMA-accelerated shuffle engine can provide substantial performance benefits
to shuffle-intensive Spark jobs, as demonstrated by the open-source “SparkRDMA”
plugin ([https://github.com/Mellanox/SparkRDMA]).
Using RDMA for shuffle significantly improves CPU utilization and reduces I/O
processing overhead by bypassing the kernel and networking stack and by
avoiding memory copies entirely. The CPU cycles saved are then available to
the Spark workload itself, which helps reduce job runtime.
This performance gain is demonstrated with both the industry-standard HiBench
TeraSort benchmark (a 1.5x speedup in sorting) and shuffle-intensive customer
applications.
SparkRDMA will be presented at Spark Summit Europe 2017 in Dublin
([https://spark-summit.org/eu-2017/events/accelerating-shuffle-a-tailor-made-rdma-solution-for-apache-spark/]).
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)