Hi,

I am working on custom Pig code that writes RDF data to text files. I
would like to *write to an ORCFile* instead, to get some of the columnar
benefits it offers.

I understand that I need to use the *HCatalog APIs*. I have an idea of how
to create an HCatSchema for my data, and I understand that I would need to
use HCatOutputFormat to write to an ORCFile.

I need some help with *how to specify the storage format as ORCFile.* I see
that ORC has built-in support, but I cannot find any examples of how to
specify which output format the HCatalog APIs write to (default Hive
table, RCFile, ORCFile, SequenceFile, etc.).

I would then need to work on reading from these ORCFiles and reconstructing
the records.

Any pointers would be appreciated. Thanks in advance.

Regards,
Abhishek
