Marcelo Vanzin created SPARK-12007:
--------------------------------------
Summary: Network library's RPC layer requires a lot of copying
Key: SPARK-12007
URL: https://issues.apache.org/jira/browse/SPARK-12007
Project: Spark
Issue Type: Improvement
Components: Spark Core
Reporter: Marcelo Vanzin
The network library's RPC layer has an external API based on byte arrays,
instead of ByteBuffer; that requires a lot of copying since the internals of
the library use ByteBuffers (or rather Netty's ByteBuf), and lots of external
clients also use ByteBuffer.
The extra copies could be avoided if the API used ByteBuffer instead.
To show an extreme case, look at an RPC send via NettyRpcEnv:
- message is encoded using JavaSerializer, resulting in a ByteBuffer
- the ByteBuffer is copied into a byte array of the right size, since its
internal array may be larger than the actual data it holds
- the network library's encoder copies the byte array into a ByteBuf
- finally the data is written to the socket
The intermediate 2 copies could be avoided if the API allowed the original
ByteBuffer to be sent instead.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]