[ 
https://issues.apache.org/jira/browse/STORM-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rick Kellogg updated STORM-127:
-------------------------------
    Component/s: storm-core

> Implement protocol buffer encoding for shell spouts and bolts
> -------------------------------------------------------------
>
>                 Key: STORM-127
>                 URL: https://issues.apache.org/jira/browse/STORM-127
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>            Reporter: James Xu
>            Priority: Minor
>             Fix For: 0.9.2-incubating
>
>
> https://github.com/nathanmarz/storm/issues/654
> The current multilang protocol, which uses JSON encoding, is pretty slow. I plan to 
> add support for protocol buffer encoding to shell spouts and bolts.
> I've completed the design from the non-JVM language side and the protocol 
> buffer side. I would just like some feedback from the storm community on how 
> to integrate this feature into the codebase.
> Should the feature be fully backwards compatible?
> Should there be two types of shell spouts and bolts (JSON and protobuf)?
> Or, what I think is the better and more generic solution: a shell bolt takes in an 
> interface for decoding and encoding (or an encoding interface and a decoding 
> interface). JSON and protobuf implementations of that interface then exist, 
> and the user selects which one(s) to plug in.
> ----------
> nathanmarz: I think making the serialization interface pluggable is fine, and 
> it should be configurable via the topology config. Please open a pull request 
> for that.
> The best place for the protobuf implementation would be under the 
> @stormprocessor account.
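
One way to read "configurable via the topology config" is that the config map names a serializer class and the shell component instantiates it reflectively. The sketch below illustrates only that idea. `WireCodec`, `Utf8Codec`, `CodecLoader` and the key `example.multilang.codec` are hypothetical names invented for this example, not Storm APIs or settings.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical codec interface; the name and methods are illustrative only.
interface WireCodec {
    byte[] encode(String message);
    String decode(byte[] bytes);
}

// Trivial stand-in implementation (plain UTF-8 text, not JSON or protobuf).
final class Utf8Codec implements WireCodec {
    public byte[] encode(String message) {
        return message.getBytes(StandardCharsets.UTF_8);
    }

    public String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

final class CodecLoader {
    // Hypothetical config key, not a real Storm setting.
    static final String CODEC_KEY = "example.multilang.codec";

    // Returns the codec named in the config, or the default when the key is absent.
    static WireCodec fromConfig(Map<String, Object> conf) {
        Object className = conf.get(CODEC_KEY);
        if (className == null) {
            return new Utf8Codec();
        }
        try {
            Object instance = Class.forName(className.toString())
                    .getDeclaredConstructor()
                    .newInstance();
            return (WireCodec) instance;
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load codec " + className, e);
        }
    }
}

final class CodecLoaderDemo {
    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(CodecLoader.CODEC_KEY, Utf8Codec.class.getName());
        WireCodec codec = CodecLoader.fromConfig(conf);
        System.out.println(codec.decode(codec.encode("hello")));
    }
}
```

Falling back to a default when the key is absent is one way to keep existing topologies working unchanged, which bears on the backwards-compatibility question raised above.
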
> ----------
> jsgilmore: I've done some refactoring of the Storm shell components. I moved 
> ShellProcess, ShellSpout and ShellBolt to backtype.storm.multilang, since the 
> refactoring also created other classes. I think it makes more sense to have 
> all multilang code live together.
> The new design overloads the ShellSpout and ShellBolt constructors with an 
> ISerializer argument. The ShellProcess contains a serializer field, which it 
> can use to send and receive objects. By default, a ShellComponent uses a 
> JsonSerializer, which was mostly factored out of the ShellProcess code.
> Instead of ShellComponents working directly with JSON objects, I created the 
> following abstract data types: Emission, SpoutMsg and Immission (maybe 
> another name would be better, but this one is technically correct). These objects 
> are written to and read from the serializer interface. The interface 
> implementation can then use any wire protocol to get data in and out of those 
> objects.
> I would appreciate comments on the design. I shall then submit a pull request.
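
The design described in this comment might look roughly like the following. The names `ISerializer`, `Emission` and `Immission` come from the comment itself, but every field, method signature and the `ShellComponentSketch` class are assumptions made for illustration. To stay free of external libraries, the default wire format here is a tab-separated line per message, a stand-in for the JSON serializer the comment mentions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Tuple handed to the subprocess. Name from the issue; fields are assumed.
final class Immission {
    final String id;
    final List<String> values;
    Immission(String id, List<String> values) { this.id = id; this.values = values; }
}

// Tuple emitted by the subprocess. Name from the issue; fields are assumed.
final class Emission {
    final String stream;
    final List<String> values;
    Emission(String stream, List<String> values) { this.stream = stream; this.values = values; }
}

// Name from the issue; these two methods are assumptions.
interface ISerializer {
    void writeImmission(Immission immission, OutputStream out) throws IOException;
    Emission readEmission(InputStream in) throws IOException;
}

// Stand-in wire format: one tab-separated line per message (not JSON).
// Values containing tabs or newlines are not supported.
final class TabLineSerializer implements ISerializer {
    public void writeImmission(Immission immission, OutputStream out) throws IOException {
        String line = immission.id + "\t" + String.join("\t", immission.values) + "\n";
        out.write(line.getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    public Emission readEmission(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != '\n') {
            if (b < 0) throw new EOFException("stream ended before end of line");
            buf.write(b);
        }
        String[] parts = new String(buf.toByteArray(), StandardCharsets.UTF_8).split("\t", -1);
        return new Emission(parts[0], Arrays.asList(parts).subList(1, parts.length));
    }
}

// The component sees only Immission and Emission, never the wire format.
final class ShellComponentSketch {
    private final ISerializer serializer;

    ShellComponentSketch() { this(new TabLineSerializer()); }  // default serializer
    ShellComponentSketch(ISerializer serializer) { this.serializer = serializer; }

    Emission exchange(Immission input, OutputStream toProcess, InputStream fromProcess)
            throws IOException {
        serializer.writeImmission(input, toProcess);
        return serializer.readEmission(fromProcess);
    }
}

final class ShellComponentDemo {
    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the subprocess pipes.
        InputStream fromProcess = new ByteArrayInputStream(
                "default\thello\n".getBytes(StandardCharsets.UTF_8));
        Emission e = new ShellComponentSketch().exchange(
                new Immission("1", Arrays.asList("a", "b")),
                new ByteArrayOutputStream(), fromProcess);
        System.out.println(e.stream + " " + e.values);
    }
}
```

Swapping in another wire protocol then means passing a different `ISerializer` to the overloaded constructor, leaving the component logic untouched.
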
> I've also implemented a protocol buffer serializer that can substitute 
> the JSON serializer. This serializer uses a binary, varint-delimited wire 
> protocol to serialize the protocol buffer messages. I can add this to the 
> @stormprocessor account.
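
A varint-delimited wire protocol usually means that each message is preceded by its byte length, encoded as a base-128 varint: seven payload bits per byte, least significant group first, with the high bit set on every byte except the last. The sketch below frames opaque byte arrays this way using only the JDK. `VarintFraming` is a hypothetical class written for this example. It is not the serializer from the comment, and it assumes the payload is an already-encoded protocol buffer message.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class VarintFraming {
    // Writes the payload length as a base-128 varint, then the payload bytes.
    static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        int len = payload.length;
        while ((len & ~0x7F) != 0) {
            out.write((len & 0x7F) | 0x80);  // low 7 bits, continuation bit set
            len >>>= 7;
        }
        out.write(len);                      // final group, continuation bit clear
        out.write(payload);
    }

    // Reads one varint length prefix followed by exactly that many payload bytes.
    static byte[] readFrame(InputStream in) throws IOException {
        int len = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) throw new EOFException("stream ended inside length prefix");
            len |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;
            shift += 7;
            if (shift > 28) throw new IOException("length prefix longer than 5 bytes");
        }
        if (len < 0) throw new IOException("length prefix overflows a 32-bit int");
        byte[] payload = new byte[len];
        new DataInputStream(in).readFully(payload);  // throws EOFException if truncated
        return payload;
    }
}

final class VarintFramingDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        VarintFraming.writeFrame(out, new byte[300]);
        byte[] wire = out.toByteArray();
        // A length of 300 encodes as the two bytes 0xAC 0x02.
        System.out.printf("%02X %02X%n", wire[0] & 0xFF, wire[1] & 0xFF);
        System.out.println(VarintFraming.readFrame(new ByteArrayInputStream(wire)).length);
    }
}
```

The protobuf Java runtime offers the same length-prefix convention through `writeDelimitedTo` and `parseDelimitedFrom`, so a real serializer would likely call those rather than hand-roll the framing.
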



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
