His primary use case is the same as Hadoop recordio: big files with lots of similar records in them. So he wants to be able to put a data description header first, and then have a big stream of records that conform to that header and don't need type fields interspersed.

Don't we call this TDenseProtocol?

In addition, he wants to have it all to work dynamically so, for example, a Python script used in Hadoop Streaming can read the header and pull fields out of records in the stream without needing to have the generated bindings.
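As a rough illustration of that idea, here's a minimal sketch in Python of reading a self-describing stream: a header listing field names and type codes, followed by records carrying values only. The wire format here is entirely made up for the example (it is not TDenseProtocol or any real Thrift protocol); the point is just that a script can parse records using only the header, with no generated bindings.

```python
import struct, io

# Hypothetical wire format (NOT Thrift's actual encoding):
#   header  = uint16 field count, then (uint16 name length, name,
#             1-byte type code) per field; b"i" = int32, b"s" = string
#   records = values only, in header order, no per-field type tags

def read_header(stream):
    """Read the field-name/type-code list that describes every record."""
    (nfields,) = struct.unpack(">H", stream.read(2))
    fields = []
    for _ in range(nfields):
        (namelen,) = struct.unpack(">H", stream.read(2))
        name = stream.read(namelen).decode("utf-8")
        typecode = stream.read(1)
        fields.append((name, typecode))
    return fields

def read_record(stream, fields):
    """Pull values in header order -- no generated bindings needed."""
    record = {}
    for name, typecode in fields:
        if typecode == b"i":
            (record[name],) = struct.unpack(">i", stream.read(4))
        else:  # b"s": uint16 length prefix, then UTF-8 bytes
            (slen,) = struct.unpack(">H", stream.read(2))
            record[name] = stream.read(slen).decode("utf-8")
    return record

# Round-trip a tiny stream: header written once, then two records.
buf = io.BytesIO()
buf.write(struct.pack(">H", 2))
for name, tc in [("id", b"i"), ("word", b"s")]:
    buf.write(struct.pack(">H", len(name)) + name.encode("utf-8") + tc)
for rid, word in [(1, "foo"), (2, "bar")]:
    buf.write(struct.pack(">i", rid))
    buf.write(struct.pack(">H", len(word)) + word.encode("utf-8"))
buf.seek(0)

fields = read_header(buf)
records = [read_record(buf, fields) for _ in range(2)]
```

Since the reader only ever consults the header, the same script keeps working when fields are added, which is presumably the appeal for Streaming jobs.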

This is something we don't have, but would be trivial to add. It'd be weird to pass the out-of-band header communications in the same stream as the data when using Hadoop streaming though, so I'm not sure that's going to be such a no-brainer.

Is someone going to get in touch with Doug directly, or are we just going to try and jump on his mailing list thread?

-Bryan
