[
https://issues.apache.org/jira/browse/STORM-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015237#comment-15015237
]
ASF GitHub Bot commented on STORM-1220:
---------------------------------------
Github user arunmahadevan commented on a diff in the pull request:
https://github.com/apache/storm/pull/894#discussion_r45436425
--- Diff: storm-core/src/jvm/backtype/storm/spout/RawScheme.java ---
@@ -18,11 +18,13 @@
package backtype.storm.spout;
import backtype.storm.tuple.Fields;
+
+import java.nio.ByteBuffer;
import java.util.List;
import static backtype.storm.utils.Utils.tuple;
public class RawScheme implements Scheme {
- public List<Object> deserialize(byte[] ser) {
+ public List<Object> deserialize(ByteBuffer ser) {
--- End diff --
@harshach Actually, I was referring to the receivers (the bolts), which might
currently be doing something like `byte[] bytes =
inputTuple.getBinaryByField("bytes");` to get the data emitted by the Kafka
spout. It appears that the `deserialize` method now returns a tuple that wraps
the input `ByteBuffer`, and that tuple is what gets emitted, so such bolts
would receive a `ByteBuffer` rather than a `byte[]`.
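
To illustrate the concern, here is a minimal sketch of what a receiving bolt
might need if the "bytes" field can carry either type. The class
`ByteBufferPayload` and the helper `toByteArray` are hypothetical names, not
part of Storm, and the `Object field` below is only a stand-in for whatever the
bolt reads from the tuple. A plain cast to `byte[]` may fail once the field
holds a `ByteBuffer`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Storm: shows how a bolt could accept
// either a byte[] or a ByteBuffer in the same tuple field.
public class ByteBufferPayload {

    /** Copies the remaining bytes of buf into a new array without moving buf's position. */
    public static byte[] toByteArray(ByteBuffer buf) {
        // duplicate() shares the content but has its own position and limit,
        // so reading from the view leaves the caller's buffer untouched.
        ByteBuffer view = buf.duplicate();
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        // Stand-in (assumption) for the value a bolt reads from the "bytes" field.
        Object field = ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8));

        byte[] bytes = (field instanceof ByteBuffer)
                ? toByteArray((ByteBuffer) field)
                : (byte[]) field;

        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Note that this reintroduces one copy on the bolt side, so it only helps bolts
that genuinely need a `byte[]`; bolts that can read from the `ByteBuffer`
directly would avoid it.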
> Avoid double copying in the Kafka spout
> ---------------------------------------
>
> Key: STORM-1220
> URL: https://issues.apache.org/jira/browse/STORM-1220
> Project: Apache Storm
> Issue Type: Bug
> Reporter: Haohui Mai
> Assignee: Haohui Mai
>
> Currently the kafka spout takes a {{ByteBuffer}} from Kafka. However, the
> serialization scheme takes a {{byte[]}} array as input. Therefore the current
> implementation copies the {{ByteBuffer}} to a new {{byte[]}} array in order
> to hook everything together.
> This JIRA proposes changing the interface of the serialization scheme to avoid
> copying the data twice in the spout.