[
https://issues.apache.org/jira/browse/HADOOP-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629739#action_12629739
]
Alex Loddengaard commented on HADOOP-3788:
------------------------------------------
A quick update: I applied Tom's patch from HADOOP-3787 to a fresh trunk build
and looked at the _InputStream_ given to _ThriftDeserializer_. This stream
also has trailing binary data, which closely resembles the trailing
binary I saw in my PB example. Here is the output from _ThriftSerializer_.
Again, the strange characters below are a result of this being copy-pasted from
_less_.
{noformat}
[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@^A^@
{noformat}
Here's the _InputStream_ given to _ThriftDeserializer_:
{noformat}
[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED],[EMAIL PROTECTED]@^D3^AH
{noformat}
As far as I can tell, this finding indicates that PBs are not compatible with
Hadoop's current implementation. Can someone please verify this and
recommend possible next steps towards compatibility? In the meantime, I'll dig
into Hadoop more. Thanks!
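To illustrate why trailing bytes in the stream matter, here is a minimal Java
sketch. It is not taken from Hadoop or from the Protocol Buffers library. It
assumes the deserializer is handed a stream that holds more than one record,
and it compares a reader that consumes everything up to EOF with one that
reads a length prefix first. The class _FramingSketch_ and its methods are
hypothetical names, and length-prefixing is only one possible approach to
compatibility, not a confirmed fix.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, for illustration only; not part of Hadoop or protobuf.
final class FramingSketch {

    // Writes a 4-byte big-endian length followed by the record bytes.
    static void writeFramed(DataOutputStream out, byte[] record) throws IOException {
        out.writeInt(record.length);
        out.write(record);
    }

    // Reads exactly one length-prefixed record and leaves the rest of the stream untouched.
    static byte[] readFramed(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] record = new byte[length];
        in.readFully(record);
        return record;
    }

    // Mimics a parser that keeps reading until EOF: it swallows every byte
    // that remains in the stream, including any records that follow the first one.
    static byte[] readToEof(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[256];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }
}
```

With two records written back to back, _readToEof_ returns the bytes of both
records as one blob, while two calls to _readFramed_ return each record
separately.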
> Add serialization for Protocol Buffers
> --------------------------------------
>
> Key: HADOOP-3788
> URL: https://issues.apache.org/jira/browse/HADOOP-3788
> Project: Hadoop Core
> Issue Type: Wish
> Components: examples, mapred
> Affects Versions: 0.19.0
> Reporter: Tom White
> Assignee: Alex Loddengaard
> Fix For: 0.19.0
>
> Attachments: hadoop-3788-v1.patch, protobuf-java-2.0.1.jar
>
>
> Protocol Buffers (http://code.google.com/p/protobuf/) are a way of encoding
> data in a compact binary format. This issue is to write a
> ProtocolBuffersSerialization to support using Protocol Buffers types in
> MapReduce programs, including an example program. This should probably go
> into contrib.