GitHub user aarondav opened a pull request:

    https://github.com/apache/spark/pull/3146

    [SPARK-4187] [Core] Switch to binary protocol for external shuffle service 
messages

    This PR elimiantes the network package's usage of the Java serializer and 
replaces it with Encodable, which is a lightweight binary protocol. Each 
message is preceded by a type id, which will allow us to change messages (by 
only adding new ones), or to change the format entirely by switching to a 
special id (such as -1).
    
    This protocol has the advantage over Java that we can guarantee that 
messages will remain compatible across compiled versions and JVMs, though it 
does not provide a clean way to do schema migration. In the future, it may be 
good to use a more heavy-weight serialization format like protobuf, thrift, or 
avro, but these all add several dependencies which are unnecessary at the 
present time.
    
    Additionally this unifies the RPC messages of NettyBlockTransferService and 
ExternalShuffleClient.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aarondav/spark free

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3146.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3146
    
----
commit 538f2a3e1cf2b25a3e0d1853238b37f7a2b556de
Author: Aaron Davidson <[email protected]>
Date:   2014-11-07T01:20:59Z

    [SPARK-4187] [Core] Switch to binary protocol for external shuffle service 
messages
    
    This PR elimiantes the network package's usage of the Java serializer and 
replaces it with Encodable, which is a lightweight binary protocol. Each 
message is preceded by a type id, which will allow us to change messages (by 
only adding new ones), or to change the format entirely by switching to a 
special id (such as -1).
    
    This protocol has the advantage over Java that we can guarantee that 
messages will remain compatible across compiled versions and JVMs, though it 
does not provide a clean way to do schema migration. In the future, it may be 
good to use a more heavy-weight serialization format like protobuf, thrift, or 
avro, but these all add several dependencies which are unnecessary at the 
present time.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to