[
https://issues.apache.org/jira/browse/STORM-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rick Kellogg updated STORM-127:
-------------------------------
Component/s: storm-core
> Implement protocol buffer encoding for shell spouts and bolts
> -------------------------------------------------------------
>
> Key: STORM-127
> URL: https://issues.apache.org/jira/browse/STORM-127
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-core
> Reporter: James Xu
> Priority: Minor
> Fix For: 0.9.2-incubating
>
>
> https://github.com/nathanmarz/storm/issues/654
> The current multilang protocol using json encoding is pretty slow. I plan to
> add the feature to shell spouts and bolts to use protocol buffer encoding.
> I've completed the design from the non-JVM language side and the protocol
> buffer side. I would just like some feedback from the storm community on how
> to integrate this feature into the codebase.
> Should the feature be fully backwards compatible?
> Should there be two types of shell spouts and bolts (json and protobuf)?
> Or, I think the better and more generic solution: a shell bolt takes in an
> interface for decoding and encoding (or an encoding interface and a decoding
> interface). There then exists a json and protobuf interface implementation
> and the user selects which one(s) to plug in.
> ----------
> nathanmarz: I think making the serialization interface pluggable is fine, and
> it should be configurable via the topology config. Please open a pull request
> for that.
> The best place for the protobuf implementation would be under the
> @stormprocessor account.
> ----------
> jsgilmore: I've done some refactoring of the Storm shell components. I moved
> ShellProcess, ShellSpout and ShellBolt to backtype.storm.multilang, since the
> refactoring also created other classes. I think it makes more sense to have
> all multilang code live together.
> The new design overloads the ShellSpout and ShellBolt constructors with an
> ISerializer argument. The ShellProcess contains a serializer field, which it
> can use to send and receive objects. By default, a ShellComponent uses a
> JsonSerializer, which was mostly factored out of the ShellProcess code.
> Instead of ShellComponents working directly with JSON objects, I created the
> following abstract data types: Emission, SpoutMsg and Immission (Maybe
> another name would be better, but this is technically correct). These objects
> are written to and read from the serializer interface. The interface
> implementation can then use any wire protocol to get data in and out of those
> objects.
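
To make the proposal above concrete, here is a minimal sketch of a pluggable serializer in Java. It is not the code from the pull request: MessageSerializer, LineSerializer and ShellChannel are hypothetical names standing in for ISerializer, JsonSerializer and ShellProcess, whose real signatures are not given in this thread. A plain string map stands in for the Emission/SpoutMsg/Immission types, and the key=value wire format is a toy chosen only to keep the example free of dependencies.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class PluggableSerializerSketch {

    /** Hypothetical stand-in for the ISerializer mentioned in the thread. */
    interface MessageSerializer {
        void write(Map<String, String> message, OutputStream out) throws IOException;
        Map<String, String> read(InputStream in) throws IOException;
    }

    /** Toy wire format: one key=value pair per line, then a line reading "end". */
    static final class LineSerializer implements MessageSerializer {
        @Override
        public void write(Map<String, String> message, OutputStream out) throws IOException {
            StringBuilder sb = new StringBuilder();
            for (Map.Entry<String, String> e : message.entrySet()) {
                sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
            }
            sb.append("end\n");
            out.write(sb.toString().getBytes(StandardCharsets.UTF_8));
            out.flush();
        }

        @Override
        public Map<String, String> read(InputStream in) throws IOException {
            Map<String, String> message = new LinkedHashMap<>();
            String line;
            while ((line = readLine(in)) != null && !line.equals("end")) {
                int eq = line.indexOf('=');
                if (eq < 0) {
                    throw new IOException("malformed line: " + line);
                }
                message.put(line.substring(0, eq), line.substring(eq + 1));
            }
            if (line == null) {
                throw new EOFException("stream ended before the end marker");
            }
            return message;
        }

        /** Reads bytes up to the next newline so nothing past the message is consumed. */
        private static String readLine(InputStream in) throws IOException {
            ByteArrayOutputStream line = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) >= 0 && b != '\n') {
                line.write(b);
            }
            if (b < 0 && line.size() == 0) {
                return null;
            }
            return new String(line.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Hypothetical shell-side channel that only ever talks to the interface. */
    static final class ShellChannel {
        private final MessageSerializer serializer;
        private final OutputStream toProcess;
        private final InputStream fromProcess;

        ShellChannel(MessageSerializer serializer, OutputStream toProcess, InputStream fromProcess) {
            this.serializer = serializer;
            this.toProcess = toProcess;
            this.fromProcess = fromProcess;
        }

        void send(Map<String, String> message) throws IOException {
            serializer.write(message, toProcess);
        }

        Map<String, String> receive() throws IOException {
            return serializer.read(fromProcess);
        }
    }
}
```

With this shape, swapping the wire protocol means passing a different MessageSerializer to the ShellChannel constructor; the component code that calls send and receive does not change.
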
> I would appreciate comments on the design. I shall then submit a pull request.
> I've also implemented a protocol buffer serializer that can substitute
> the JSON serializer. This serializer uses a binary varint-delimited wire
> protocol to serialise the protocol buffer messages. I can add this to the
> @stormprocessor account.
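
For readers unfamiliar with varint-delimited framing, the sketch below shows the idea in plain Java: each message is preceded by its byte length encoded as a base-128 varint, so the reader knows exactly how many bytes to consume. This is an illustration only, not the serializer from the comment above. VarintFraming and its methods are hypothetical names, and the payload is an opaque byte array where a real implementation would carry an encoded protocol buffer message. The protobuf Java library offers the same kind of framing through writeDelimitedTo and parseDelimitedFrom, though the thread does not say whether the proposed serializer uses those calls.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class VarintFraming {

    /** Writes a non-negative int as a base-128 varint, low 7 bits first. */
    static void writeVarint(int value, OutputStream out) throws IOException {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value);
    }

    /** Reads a varint of at most 5 bytes, enough for a 32-bit length. */
    static int readVarint(InputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("stream ended inside a length prefix");
            }
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IOException("length prefix longer than 5 bytes");
    }

    /** Writes one frame: varint length, then the payload bytes. */
    static void writeDelimited(byte[] payload, OutputStream out) throws IOException {
        writeVarint(payload.length, out);
        out.write(payload);
    }

    /** Reads one frame, looping because read() may return fewer bytes than asked. */
    static byte[] readDelimited(InputStream in) throws IOException {
        int length = readVarint(in);
        if (length < 0) {
            throw new IOException("negative frame length");
        }
        byte[] payload = new byte[length];
        int offset = 0;
        while (offset < length) {
            int n = in.read(payload, offset, length - offset);
            if (n < 0) {
                throw new EOFException("stream ended inside a frame");
            }
            offset += n;
        }
        return payload;
    }
}
```

A 300-byte payload, for example, is framed with the two prefix bytes 0xAC 0x02. A production reader would also want an upper bound on the frame length before allocating the buffer.
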