+1 to Roberto's question... I'd love to see some more examples here too.  I
looked into writing a protocol buffer SerDe a little while ago (the company I
was working for had data coming in as protobufs, and it seemed silly to convert
every piece to Thrift first) and was underwhelmed by the
documentation/explanations.  FWIW, and maybe to generate a little friendly
competition, I was able to write a Pig LoadFunc to load arbitrary protocol
buffers into Pig tuples without much trouble...
Kevin

On Wed, Jul 8, 2009 at 4:26 PM, Roberto Congiu <[email protected]> wrote:

> Hi, I am writing a SerDe class to be able to query a proprietary format
> of ours from Hive.
> The format is basically a sequence of records that are maps encoded in
> binary, for which we have access libraries.
> The file is also gzipped.
>
> From what I understand, I need to:
> 1 - Write a FileInputFormat class to read the file and extract the single
> records as Writables (but I am not clear on how to tell Hive to use this
> file format, since all I can use is STORED AS SEQUENCEFILE/TEXTFILE. How do
> I plug my format in there?)
> 2 - Write a SerDe (since I only need to read the data, I need just the
> deserializer part) and an ObjectInspector to let Hive understand how to find
> a column.
>
> Is there any info around on these, or somebody who's done something
> similar?
> Thanks in advance,
> Roberto
>
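
For anyone following the thread, here is a minimal sketch of the record-extraction half of step 1, using only the Java standard library. The framing is an assumption: the real format is proprietary, so a 4-byte big-endian length prefix before each record stands in for it, and the class and method names (GzipRecordReader, readRecords, writeRecords) are hypothetical. The Hadoop and Hive wiring is deliberately left out; in a real setup this loop would presumably sit inside the RecordReader returned by the custom FileInputFormat, with each payload wrapped in a Writable.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical reader: splits a gzipped stream into raw record payloads. */
final class GzipRecordReader {

    /** Reads every length-prefixed record from a gzipped stream. */
    static List<byte[]> readRecords(InputStream gzipped) throws IOException {
        List<byte[]> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new GZIPInputStream(gzipped))) {
            while (true) {
                int length;
                try {
                    // ASSUMPTION: each record starts with a 4-byte big-endian length.
                    length = in.readInt();
                } catch (EOFException endOfStream) {
                    break; // no further length prefix: treat as end of file
                }
                if (length < 0) {
                    throw new IOException("Negative record length: " + length);
                }
                byte[] payload = new byte[length];
                in.readFully(payload); // throws EOFException if the record is truncated
                records.add(payload);
            }
        }
        return records;
    }

    /** Helper for experiments: writes payloads using the same assumed framing. */
    static byte[] writeRecords(List<byte[]> payloads) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new GZIPOutputStream(buffer))) {
            for (byte[] payload : payloads) {
                out.writeInt(payload.length);
                out.write(payload);
            }
        }
        return buffer.toByteArray();
    }
}
```

Each byte[] returned by readRecords would then be handed to the existing access library inside the deserializer from step 2. One thing to keep in mind: a gzipped file cannot be read starting from an arbitrary offset, so each file has to be consumed from the beginning by a single reader.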
