[
https://issues.apache.org/jira/browse/HADOOP-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632229#action_12632229
]
Tom White commented on HADOOP-3788:
-----------------------------------
bq. the OutputStream when serializing would need meta data included in it
I don't think we want to invent a new format here - this issue is to make
serialization work with existing formats, such as SequenceFile (or the new
TFile, or HADOOP-4065).
As an experiment, I modified PBDeserializer to have a deserialize method that
takes a length (+in+ is now a CodedInputStream):
{code}
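public T deserialize(T t, int length) throws IOException {
  t = (t == null) ? (T) newInstance() : t;
  // Limit the CodedInputStream to this record's bytes so that mergeFrom()
  // stops at the record boundary instead of reading on into the next record.
  int limit = in.pushLimit(length);
  Message result = t.newBuilderForType().mergeFrom(in).build();
  // Restore the previous limit before the next record is read.
  in.popLimit(limit);
  return (T) result;
}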
{code}
I then modified TestPBSerializationIsolated to serialize two strings to the
stream. With the deserialize method that doesn't take a length, the test
failed; when I passed the length, it succeeded.
So, I think we can do this without modifying Protocol Buffers. The changes
needed are a new method on Deserializer (and Serializer?) that takes a length,
plus changes in the framework to call the new method when appropriate.
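To illustrate why the length matters, here is a minimal, self-contained sketch that does not use Hadoop or Protocol Buffers at all. Every name in it (LengthBoundedDemo, writeRecords, readFirstUnbounded, readBounded) is hypothetical, and plain UTF-8 bytes stand in for an encoding that, like a protobuf message, has no end-of-record marker of its own. It assumes the container (in the role of SequenceFile) stores each record's length and can hand it to the deserializer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: none of these names come from Hadoop or
// Protocol Buffers.
final class LengthBoundedDemo {

  // Container side: write each record as a 4-byte length followed by its bytes.
  static byte[] writeRecords(String... records) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buffer);
    for (String record : records) {
      byte[] payload = record.getBytes(StandardCharsets.UTF_8);
      out.writeInt(payload.length);
      out.write(payload);
    }
    out.flush();
    return buffer.toByteArray();
  }

  // Deserializer without a length: the encoding has no terminator, so the
  // only stopping point it can find is the end of the stream.
  static String deserialize(DataInputStream in) throws IOException {
    ByteArrayOutputStream rest = new ByteArrayOutputStream();
    int b;
    while ((b = in.read()) != -1) {
      rest.write(b);
    }
    return new String(rest.toByteArray(), StandardCharsets.UTF_8);
  }

  // Deserializer with a length: reads exactly one record and leaves the
  // stream positioned at the start of the next one.
  static String deserialize(DataInputStream in, int length) throws IOException {
    byte[] payload = new byte[length];
    in.readFully(payload);
    return new String(payload, StandardCharsets.UTF_8);
  }

  // The container reads the length but does not pass it on, so the first
  // record swallows the second record's header and payload as well.
  static String readFirstUnbounded(byte[] data) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    in.readInt();
    return deserialize(in);
  }

  // The container passes each stored length to the deserializer.
  static List<String> readBounded(byte[] data) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    List<String> records = new ArrayList<>();
    while (in.available() > 0) {
      int length = in.readInt();
      records.add(deserialize(in, length));
    }
    return records;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = writeRecords("first", "second");
    System.out.println("bounded records: " + readBounded(data));
    System.out.println("unbounded first record is \"first\": "
        + readFirstUnbounded(data).equals("first"));
  }
}
```

The bounded overload plays the same role as pushLimit/popLimit in the snippet above: the stored length, not the encoding itself, tells the reader where one record ends and the next begins.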
> Add serialization for Protocol Buffers
> --------------------------------------
>
> Key: HADOOP-3788
> URL: https://issues.apache.org/jira/browse/HADOOP-3788
> Project: Hadoop Core
> Issue Type: Wish
> Components: examples, mapred
> Affects Versions: 0.19.0
> Reporter: Tom White
> Assignee: Alex Loddengaard
> Fix For: 0.19.0
>
> Attachments: hadoop-3788-v1.patch, hadoop-3788-v2.patch,
> protobuf-java-2.0.1.jar
>
>
> Protocol Buffers (http://code.google.com/p/protobuf/) are a way of encoding
> data in a compact binary format. This issue is to write a
> ProtocolBuffersSerialization to support using Protocol Buffers types in
> MapReduce programs, including an example program. This should probably go
> into contrib.