[ 
https://issues.apache.org/jira/browse/SPARK-28254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-28254:
----------------------------------
    Affects Version/s:     (was: 3.0.0)
                       3.1.0

> Use More Generic Netty Interfaces to Support Fabric Network in Spark
> --------------------------------------------------------------------
>
>                 Key: SPARK-28254
>                 URL: https://issues.apache.org/jira/browse/SPARK-28254
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 3.1.0
>            Reporter: jiafu zhang
>            Priority: Minor
>
> Spark assumes that all of its network transports are socket-based, although
> Netty offers more general interfaces than sockets, such as Channel rather
> than SocketChannel. Sockets (TCP/IP) have the advantage of being widely
> supported: almost every type of network, including fabric networks, supports
> them. However, sockets are not as efficient as protocols designed for
> fabrics, such as Intel OPA, which has a thinner protocol stack and lower
> latency than sockets, not to mention more advanced features such as flow
> control. From a performance standpoint, it is therefore better for Spark to
> support other network protocols as well. (We are also proposing that Netty
> support more fabric semantics, such as RMA (Remote Memory Access).) For
> fabric networks, we can use Libfabric, the library from OFI (OpenFabrics
> Interfaces), to get a unified API across fabrics from different vendors,
> which will greatly reduce our effort.
> To make this possible, we need to use more generic Netty interfaces in
> several places in the network-common module. For example, use Channel
> instead of SocketChannel in the TransportClientFactory and TransportContext
> classes.
> In addition, the Bootstrap setup and the IOMode class need more flexible
> options. For example, the createClient method in TransportClientFactory
> could let the user pass additional options to the Bootstrap instance.
> Currently, the options are fixed for TCP/IP.
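
A rough sketch of the "more flexible options" idea above, written without a
Netty dependency so it runs on its own. Everything here is an assumption for
illustration: the class and method names are hypothetical, options are keyed by
plain strings, and the default keys are borrowed from the names of Netty's
ChannelOption constants rather than taken from the issue text. "OFI_PROVIDER"
and its value are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BootstrapOptionsSketch {

    /** TCP/IP-oriented defaults, standing in for the currently fixed options. */
    public static Map<String, Object> defaultTcpOptions(int connectTimeoutMs) {
        Map<String, Object> opts = new LinkedHashMap<>();
        opts.put("TCP_NODELAY", true);
        opts.put("SO_KEEPALIVE", true);
        opts.put("CONNECT_TIMEOUT_MILLIS", connectTimeoutMs);
        return opts;
    }

    /** Caller-supplied options extend the defaults and win on conflicts. */
    public static Map<String, Object> mergeOptions(Map<String, Object> defaults,
                                                   Map<String, Object> extra) {
        Map<String, Object> merged = new LinkedHashMap<>(defaults);
        merged.putAll(extra);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> extra = new LinkedHashMap<>();
        extra.put("OFI_PROVIDER", "example-provider"); // hypothetical fabric option
        extra.put("TCP_NODELAY", false);               // override a TCP default
        System.out.println(mergeOptions(defaultTcpOptions(120000), extra));
    }
}
```

In a real change, the merged options would presumably be applied to the Netty
Bootstrap one by one instead of being kept in a map.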
> In the IOMode class, we can add one more option, "OFI", for initializing OFI
> channels based on Libfabric. The NettyUtils class can then provide an OFI
> event loop group, client channel class, and server channel class for the
> "OFI" IOMode.
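
A minimal sketch of the IOMode idea described above, again without a Netty
dependency: channel classes are represented by their names as strings. NIO and
EPOLL stand for the existing socket-based modes and map to real Netty class
names; the "example.ofi.*" names are placeholders, since the issue does not
name the OFI channel classes. The class and method names of the sketch itself
are also hypothetical.

```java
import java.util.Locale;

public class IoModeSketch {

    /** Transport selector; OFI is the proposed addition. */
    public enum IOMode { NIO, EPOLL, OFI }

    /** Parses a mode name case-insensitively, e.g. "ofi" -> OFI. */
    public static IOMode parse(String name) {
        return IOMode.valueOf(name.toUpperCase(Locale.ROOT));
    }

    /** Name of the client channel class that would be used for a mode. */
    public static String clientChannelClassName(IOMode mode) {
        switch (mode) {
            case NIO:
                return "io.netty.channel.socket.nio.NioSocketChannel";
            case EPOLL:
                return "io.netty.channel.epoll.EpollSocketChannel";
            case OFI:
                return "example.ofi.OfiChannel"; // placeholder name
            default:
                throw new IllegalArgumentException("Unknown io mode: " + mode);
        }
    }

    /** Name of the server channel class that would be used for a mode. */
    public static String serverChannelClassName(IOMode mode) {
        switch (mode) {
            case NIO:
                return "io.netty.channel.socket.nio.NioServerSocketChannel";
            case EPOLL:
                return "io.netty.channel.epoll.EpollServerSocketChannel";
            case OFI:
                return "example.ofi.OfiServerChannel"; // placeholder name
            default:
                throw new IllegalArgumentException("Unknown io mode: " + mode);
        }
    }

    public static void main(String[] args) {
        for (IOMode mode : IOMode.values()) {
            System.out.println(mode + ": client=" + clientChannelClassName(mode)
                + ", server=" + serverChannelClassName(mode));
        }
    }
}
```

Callers that only ever see the generic Channel type would not need to change
when a new mode such as OFI is added; only this dispatch would.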



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
