GitHub user aarondav opened a pull request:
https://github.com/apache/spark/pull/3146
[SPARK-4187] [Core] Switch to binary protocol for external shuffle service
messages
This PR elimiantes the network package's usage of the Java serializer and
replaces it with Encodable, which is a lightweight binary protocol. Each
message is preceded by a type id, which will allow us to change messages (by
only adding new ones), or to change the format entirely by switching to a
special id (such as -1).
This protocol has the advantage over Java that we can guarantee that
messages will remain compatible across compiled versions and JVMs, though it
does not provide a clean way to do schema migration. In the future, it may be
good to use a more heavy-weight serialization format like protobuf, thrift, or
avro, but these all add several dependencies which are unnecessary at the
present time.
Additionally this unifies the RPC messages of NettyBlockTransferService and
ExternalShuffleClient.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/aarondav/spark free
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/3146.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3146
----
commit 538f2a3e1cf2b25a3e0d1853238b37f7a2b556de
Author: Aaron Davidson <[email protected]>
Date: 2014-11-07T01:20:59Z
[SPARK-4187] [Core] Switch to binary protocol for external shuffle service
messages
This PR elimiantes the network package's usage of the Java serializer and
replaces it with Encodable, which is a lightweight binary protocol. Each
message is preceded by a type id, which will allow us to change messages (by
only adding new ones), or to change the format entirely by switching to a
special id (such as -1).
This protocol has the advantage over Java that we can guarantee that
messages will remain compatible across compiled versions and JVMs, though it
does not provide a clean way to do schema migration. In the future, it may be
good to use a more heavy-weight serialization format like protobuf, thrift, or
avro, but these all add several dependencies which are unnecessary at the
present time.
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]