Hi all,
I was wondering if anyone is using Hive with protocol buffers.  The
Hadoop wiki links to
http://www.slideshare.net/ragho/hive-user-meeting-august-2009-facebook
for SerDe examples; that presentation says there is no built-in
support for protobufs.  Since it is about a year old, I was
wondering whether any UDFs, native or third-party, have appeared
since then to deal with them.

I am also curious about the relative efficiency of performing SerDe
using UDFs in Hive vs. running a separate Hadoop job to first
deserialize the data from protocol buffers into an ASCII flat file
with only the "interesting" fields (going from ~15 fields to ~3), and
then doing the rest of the computation in Hive.  I'd appreciate any
comments!

Thanks,
--Leo
