Marcelo Vanzin created SPARK-12007:
--------------------------------------

             Summary: Network library's RPC layer requires a lot of copying
                 Key: SPARK-12007
                 URL: https://issues.apache.org/jira/browse/SPARK-12007
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
            Reporter: Marcelo Vanzin


The network library's RPC layer has an external API based on byte arrays, 
instead of ByteBuffer; that requires a lot of copying since the internals of 
the library use ByteBuffers (or rather Netty's ByteBuf), and lots of external 
clients also use ByteBuffer.

The extra copies could be avoided if the API used ByteBuffer instead.

To show an extreme case, look at an RPC send via NettyRpcEnv:

- message is encoded using JavaSerializer, resulting in a ByteBuffer
- the ByteBuffer is copied into a byte array of the right size, since its 
internal array may be larger than the actual data it holds
- the network library's encoder copies the byte array into a ByteBuf
- finally the data is written to the socket

The intermediate 2 copies could be avoided if the API allowed the original 
ByteBuffer to be sent instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to