[ https://issues.apache.org/jira/browse/SPARK-28254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun updated SPARK-28254:
----------------------------------
Affects Version/s: 3.1.0 (was: 3.0.0)
> Use More Generic Netty Interfaces to Support Fabric Network in Spark
> --------------------------------------------------------------------
>
> Key: SPARK-28254
> URL: https://issues.apache.org/jira/browse/SPARK-28254
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Affects Versions: 3.1.0
> Reporter: jiafu zhang
> Priority: Minor
>
> Spark assumes every network it runs on is socket-based, although Netty offers
> interfaces more general than sockets, such as Channel rather than
> SocketChannel. Sockets (TCP/IP) have the advantage of being widely supported:
> almost every type of network, including fabric networks, supports them.
> However, sockets are less efficient than protocols designed for fabrics,
> such as Intel OPA, which has a thinner protocol stack and lower latency, not
> to mention more advanced features such as flow control. From a performance
> standpoint, it would therefore be better for Spark to support other network
> protocols as well. (We are also proposing that Netty support more fabric
> semantics, such as RMA (Remote Memory Access).) For fabric networks, we can
> use the Libfabric library from OFI (OpenFabrics Interfaces) to get a unified
> API across fabric networks from different vendors, which would greatly
> reduce our effort.
> To make this possible, we need to use more generic Netty interfaces in
> several places in the network-common module. For example, use Channel instead
> of SocketChannel in the TransportClientFactory and TransportContext classes.
> We also need more flexible options for Bootstrap and the IOMode class. For
> example, the createClient method in TransportClientFactory could let the
> caller pass additional options to the bootstrap instance. Currently, the
> options are fixed for TCP/IP.
> In the IOMode class, we can add one more option, "OFI", for initializing OFI
> channels based on Libfabric. The NettyUtils class can then provide an OFI
> event loop group, client channel class, and server channel class for the
> "OFI" IOMode.
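The proposal in the quoted description can be sketched roughly as follows. This is only an illustration under stated assumptions, not Spark's or Netty's actual code: the Channel and SocketChannel interfaces below are local stand-ins for the Netty types so the example compiles without Netty on the classpath, OfiChannel is a hypothetical Libfabric-backed channel that does not exist today, and clientChannelClass and bootstrapOptions are made-up helper names. The option keys merely echo the names of Netty ChannelOption constants such as TCP_NODELAY and SO_KEEPALIVE.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class FabricIoModeSketch {
    // Local stand-ins for Netty's channel interfaces (assumption: no Netty dependency here).
    interface Channel {}
    interface SocketChannel extends Channel {}

    // Stand-ins for the socket-based channel implementations.
    static class NioSocketChannel implements SocketChannel {}
    static class EpollSocketChannel implements SocketChannel {}

    // Hypothetical Libfabric-backed channel: a Channel, but deliberately not a SocketChannel.
    static class OfiChannel implements Channel {}

    // The socket-based modes plus the proposed "OFI" mode.
    enum IOMode { NIO, EPOLL, OFI }

    // Returning the generic Channel type is what lets a non-socket mode fit in.
    static Class<? extends Channel> clientChannelClass(IOMode mode) {
        switch (mode) {
            case NIO:
                return NioSocketChannel.class;
            case EPOLL:
                return EpollSocketChannel.class;
            case OFI:
                return OfiChannel.class;
            default:
                throw new IllegalArgumentException("Unknown io mode: " + mode);
        }
    }

    // Builds bootstrap options: TCP defaults only for socket modes, then caller-supplied extras.
    static Map<String, Object> bootstrapOptions(IOMode mode, Map<String, Object> extra) {
        Map<String, Object> options = new LinkedHashMap<>();
        if (mode != IOMode.OFI) {
            options.put("TCP_NODELAY", true);
            options.put("SO_KEEPALIVE", true);
        }
        options.putAll(extra); // caller-supplied options override the defaults
        return options;
    }

    public static void main(String[] args) {
        for (IOMode mode : IOMode.values()) {
            System.out.println(mode + " -> " + clientChannelClass(mode).getSimpleName());
        }
        // "OFI_PROVIDER" is an invented option key, used only to show the pass-through.
        Map<String, Object> extra = Collections.singletonMap("OFI_PROVIDER", "psm2");
        System.out.println(bootstrapOptions(IOMode.OFI, extra));
    }
}
```

The point of the sketch is the return type: as long as callers depend on Channel rather than SocketChannel, adding a mode whose channel is not a socket needs no changes at the call sites, and the TCP-specific options stay out of the non-socket path.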