[ 
https://issues.apache.org/jira/browse/HADOOP-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630237#action_12630237
 ] 

Doug Cutting commented on HADOOP-3788:
--------------------------------------

> I worry about defining the contract for deserializers so that the end of the 
> stream marks the end of the object being read 

That's certainly not the contract in effect today.  One possible advantage of 
such a contract is that, if the framework knows the length, it can communicate 
it in this way to the object, so that lengths need not be stored twice.  For 
example, SequenceFile stores the lengths of keys and values, and, if keys and 
values are Text, we store their length again in the key and value.  But reading 
until EOF seems a poor way of communicating this.  Perhaps we could change the 
deserialize API to be optionally passed a length or somesuch.
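
To make the idea concrete, here is a rough sketch of what a length-aware 
deserialize call might look like.  Everything in it is hypothetical: the 
LengthAwareDeserializer interface, the UTF8_STRING instance, and the 
writeRecords/readRecords helpers are invented for illustration and are not 
part of Hadoop's serialization API or of any attached patch.  The container 
writes each record's length once, and the deserializer is handed that length 
instead of reading its own prefix or reading until EOF.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LengthAwareDemo {

    // Hypothetical interface (an assumption, not Hadoop's Deserializer):
    // the framework passes the record length it already knows.
    interface LengthAwareDeserializer<T> {
        T deserialize(DataInputStream in, int length) throws IOException;
    }

    // Reads exactly `length` bytes as UTF-8; the payload carries no
    // length prefix of its own.
    static final LengthAwareDeserializer<String> UTF8_STRING = (in, length) -> {
        byte[] buf = new byte[length];
        in.readFully(buf);
        return new String(buf, StandardCharsets.UTF_8);
    };

    // Stand-in for a container format: one 4-byte length per record,
    // followed by the raw payload bytes.
    static byte[] writeRecords(String... values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (String value : values) {
            byte[] payload = value.getBytes(StandardCharsets.UTF_8);
            out.writeInt(payload.length); // the only place the length is stored
            out.write(payload);
        }
        out.flush();
        return bytes.toByteArray();
    }

    // The framework reads each length and hands it to the deserializer.
    static List<String> readRecords(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        List<String> result = new ArrayList<>();
        while (in.available() > 0) {
            int length = in.readInt();
            result.add(UTF8_STRING.deserialize(in, length));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeRecords("key", "value");
        for (String record : readRecords(data)) {
            System.out.println(record);
        }
    }
}
```

In this sketch several records can share one stream, since each 
deserializer call consumes only the bytes it was told about, which an 
EOF-terminated contract would not allow.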


> Add serialization for Protocol Buffers
> --------------------------------------
>
>                 Key: HADOOP-3788
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3788
>             Project: Hadoop Core
>          Issue Type: Wish
>          Components: examples, mapred
>    Affects Versions: 0.19.0
>            Reporter: Tom White
>            Assignee: Alex Loddengaard
>             Fix For: 0.19.0
>
>         Attachments: hadoop-3788-v1.patch, hadoop-3788-v2.patch, 
> protobuf-java-2.0.1.jar
>
>
> Protocol Buffers (http://code.google.com/p/protobuf/) are a way of encoding 
> data in a compact binary format. This issue is to write a 
> ProtocolBuffersSerialization to support using Protocol Buffers types in 
> MapReduce programs, including an example program. This should probably go 
> into contrib. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
