Hi guys,

I want to pass the serialized files generated by Protocol Buffers
directly to a MapReduce job. Currently I have the following data flow:

PB client (serializes files) -> send over network -> PB server
(deserializes files) -> copy to Hadoop FS -> feed deserialized
file to MapReduce job.

For the client and server I have implemented RPC. However, I think the
extra deserialization step on the server is an overhead. Is there any
way to feed the serialized file directly to MapReduce and have the job
work on it?

I want the data flow to look like this:

PB client (serializes files) -> send over network -> copy to Hadoop FS
-> feed serialized file to MapReduce job.
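
To make the desired flow concrete, here is a rough Python sketch of what
I have in mind (it is not Hadoop code). It assumes the client writes each
serialized message with a varint length prefix, which as far as I know is
the framing that the Java writeDelimitedTo method produces. All function
names are hypothetical, and the byte strings stand in for serialized
Protocol Buffers messages; a real job would pass the generated class's
parse function instead of the placeholder.

```python
import io


def write_varint(out, value):
    """Write a non-negative int as a base-128 varint (low 7 bits first)."""
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.write(bytes([byte | 0x80]))  # more bytes follow
        else:
            out.write(bytes([byte]))         # last byte
            return


def read_varint(inp):
    """Read one varint; return None on a clean end of stream."""
    shift = 0
    result = 0
    while True:
        raw = inp.read(1)
        if not raw:
            if shift == 0:
                return None
            raise EOFError("truncated varint")
        byte = raw[0]
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result
        shift += 7


def write_records(out, payloads):
    """Client side: append each serialized message with its length prefix."""
    for payload in payloads:
        write_varint(out, len(payload))
        out.write(payload)


def read_records(inp):
    """Job side: yield the raw serialized bytes of each record in turn."""
    while True:
        size = read_varint(inp)
        if size is None:
            return
        payload = inp.read(size)
        if len(payload) != size:
            raise EOFError("truncated record")
        yield payload


def map_records(inp, parse):
    """Hypothetical mapper loop: deserialize only inside the job."""
    for payload in read_records(inp):
        yield parse(payload)
```

With this layout the server would only copy bytes to Hadoop FS, and the
only deserialization would happen inside the job. The sketch reads a file
from start to finish; it does not address how such a file would be split
across several mappers.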

I found out that Apache Avro provides this feature. Does Protocol
Buffers support this scenario too?


