[ 
https://issues.apache.org/jira/browse/HADOOP-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632229#action_12632229
 ] 

Tom White commented on HADOOP-3788:
-----------------------------------

bq. the OutputStream when serializing would need meta data included in it

I don't think we want to invent a new format here - this issue is to make 
serialization work with existing formats, such as SequenceFile (or the new 
TFile, or HADOOP-4065).

As an experiment, I modified PBDeserializer to have a deserialize method that 
takes a length (+in+ is now a CodedInputStream):

{code}
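  @SuppressWarnings("unchecked")
  public T deserialize(T t, int length) throws IOException {
    // Create an instance only if the caller did not supply one to reuse.
    t = (t == null) ? (T) newInstance() : t;

    // Restrict parsing to the next 'length' bytes so the parser stops
    // at the end of this record instead of at the end of the stream.
    int limit = in.pushLimit(length);
    try {
      Message result =
        t.newBuilderForType().mergeFrom(in).build();
      return (T) result;
    } finally {
      // Restore the previous limit even if parsing fails.
      in.popLimit(limit);
    }
  }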
{code}

I then modified TestPBSerializationIsolated to serialize two strings to the 
stream. With the deserialize method that doesn't take a length, the test 
failed; when I passed the length, the test succeeded.
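
A dependency-free sketch of the effect described above, under stated assumptions: raw UTF-8 bytes stand in for Protocol Buffers messages (both are not self-delimiting), and the class and method names are hypothetical, not part of the patch. An unbounded read swallows both records, while a read bounded by the record length returns one at a time.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; plain strings stand in for protobuf messages.
class LengthLimitedReadDemo {

  // Stand-in for a parser with no length limit: it consumes the stream
  // until end of input, so it cannot tell where one record ends.
  static String readToEnd(InputStream in) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    byte[] chunk = new byte[64];
    int n;
    while ((n = in.read(chunk)) != -1) {
      buf.write(chunk, 0, n);
    }
    return new String(buf.toByteArray(), StandardCharsets.UTF_8);
  }

  // Stand-in for the length-taking deserialize: it reads exactly
  // 'length' bytes and leaves the rest of the stream untouched.
  static String readWithLength(InputStream in, int length) throws IOException {
    byte[] record = new byte[length];
    new DataInputStream(in).readFully(record);
    return new String(record, StandardCharsets.UTF_8);
  }

  // Two records written back to back with no delimiter between them.
  static byte[] twoRecords() {
    return "firstsecond".getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    byte[] stream = twoRecords();

    // Without a length, both records come back fused together.
    System.out.println(readToEnd(new ByteArrayInputStream(stream)));

    // With lengths, each record is recovered separately.
    InputStream in = new ByteArrayInputStream(stream);
    System.out.println(readWithLength(in, 5));
    System.out.println(readWithLength(in, 6));
  }
}
```

In the real deserializer the bounded read corresponds to the pushLimit/popLimit pair around mergeFrom.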

So, I think we can do this without modifying Protocol Buffers. The change 
needed is a new method on Deserializer (and Serializer?) that takes a length, 
plus changes in the framework to call the new method when appropriate.
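
One possible shape for that change, as a rough sketch only. The interface name LengthAwareDeserializer, its methods, and the 4-byte length prefix are all assumptions made for illustration; this is not Hadoop's actual Deserializer interface, and a real container format such as SequenceFile keeps record lengths in its own layout. The read loop plays the role of the framework: it already knows each record's length and passes it to the deserializer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from Hadoop or the patch.
class LengthAwareSketch {

  /** Assumed interface: a deserializer that is told each record's length. */
  interface LengthAwareDeserializer<T> {
    void open(DataInputStream in) throws IOException;
    T deserialize(T reuse, int length) throws IOException;
  }

  /** Toy implementation: each record is a UTF-8 string of 'length' bytes. */
  static class StringDeserializer implements LengthAwareDeserializer<String> {
    private DataInputStream in;

    public void open(DataInputStream in) {
      this.in = in;
    }

    public String deserialize(String reuse, int length) throws IOException {
      byte[] buf = new byte[length];
      in.readFully(buf);  // consume exactly one record
      return new String(buf, StandardCharsets.UTF_8);
    }
  }

  // Stand-in for the container format: a 4-byte length before each record.
  static byte[] writeRecords(String... values) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    for (String v : values) {
      byte[] payload = v.getBytes(StandardCharsets.UTF_8);
      out.writeInt(payload.length);
      out.write(payload);
    }
    out.flush();
    return bytes.toByteArray();
  }

  // Stand-in for the framework: read the stored length, then hand it
  // to the length-taking deserialize method.
  static List<String> readRecords(byte[] data, LengthAwareDeserializer<String> d)
      throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    d.open(in);
    List<String> result = new ArrayList<>();
    while (in.available() > 0) {
      int length = in.readInt();
      result.add(d.deserialize(null, length));
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = writeRecords("alpha", "beta");
    System.out.println(readRecords(data, new StringDeserializer()));
  }
}
```

Keeping the length outside the serialized payload means the payload bytes stay in the serializer's native encoding, which fits the goal of not inventing a new format.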

> Add serialization for Protocol Buffers
> --------------------------------------
>
>                 Key: HADOOP-3788
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3788
>             Project: Hadoop Core
>          Issue Type: Wish
>          Components: examples, mapred
>    Affects Versions: 0.19.0
>            Reporter: Tom White
>            Assignee: Alex Loddengaard
>             Fix For: 0.19.0
>
>         Attachments: hadoop-3788-v1.patch, hadoop-3788-v2.patch, 
> protobuf-java-2.0.1.jar
>
>
> Protocol Buffers (http://code.google.com/p/protobuf/) are a way of encoding 
> data in a compact binary format. This issue is to write a 
> ProtocolBuffersSerialization to support using Protocol Buffers types in 
> MapReduce programs, including an example program. This should probably go 
> into contrib. 
