[ https://issues.apache.org/jira/browse/PIG-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marcin Czech updated PIG-3308:
------------------------------
Status: Patch Available (was: Open)
> Storing data in hive columnar rc format
> ---------------------------------------
>
> Key: PIG-3308
> URL: https://issues.apache.org/jira/browse/PIG-3308
> Project: Pig
> Issue Type: Improvement
> Components: piggybank
> Affects Versions: 0.10.1
> Reporter: Marcin Czech
> Labels: patch
> Fix For: 0.10.1
>
> Attachments: PIG-3308.patch
>
>
> I've coded HiveColumnarStorage, which can store Pig structures as Hive
> columnar RC tables. The code is based on Elephant-bird's RCFilePigStorage.
> The difference is that the data is stored in a Hive-friendly format, so the
> files can be read from Hive.
> Example Pig schema:
> {code}
> f1:tuple (f11: chararray,f12: chararray),f2:map[]
> {code}
> Hive schema:
> {code}
> CREATE TABLE sample_table (
>   f1 struct<f11:string,f12:string>,
>   f2 array<struct<f21:string,f22:string>>)
> PARTITIONED BY (p string)
> STORED AS RCFILE
> {code}
> or, alternatively, with f2 declared as a map:
> {code}
> CREATE TABLE sample_table (
>   f1 struct<f11:string,f12:string>,
>   f2 MAP<string,string>)
> PARTITIONED BY (p string)
> STORED AS RCFILE
> {code}
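
The type correspondence in the example above (tuple to struct, chararray to string, map[] to MAP<string,string>) can be sketched as a small helper that derives the Hive DDL from a simplified Pig schema. This is only an illustration and is not taken from PIG-3308.patch: the function names `hive_type` and `create_table`, the nested-list schema representation, and the restriction to the types shown in the issue are all assumptions made for this sketch.

```python
# Hypothetical illustration of the Pig-to-Hive type mapping shown in the issue.
# Not part of PIG-3308.patch; only the types from the example are covered.
PRIMITIVES = {"chararray": "string"}


def hive_type(pig_type):
    """Return a Hive type string for a simplified Pig type description.

    pig_type is a primitive name ("chararray"), the string "map",
    or a pair ("tuple", [(field_name, field_type), ...]).
    """
    if isinstance(pig_type, tuple) and pig_type[0] == "tuple":
        fields = ",".join(f"{name}:{hive_type(t)}" for name, t in pig_type[1])
        return f"struct<{fields}>"
    if pig_type == "map":
        # The issue's second DDL declares the untyped Pig map as map<string,string>.
        return "map<string,string>"
    return PRIMITIVES[pig_type]


def create_table(table, columns, partition_col="p"):
    """Build a CREATE TABLE statement in the shape used in the issue."""
    cols = ", ".join(f"{name} {hive_type(t)}" for name, t in columns)
    return (
        f"CREATE TABLE {table} ({cols})\n"
        f"PARTITIONED BY ({partition_col} string)\n"
        "STORED AS RCFILE"
    )


# Pig schema from the issue: f1:tuple(f11:chararray, f12:chararray), f2:map[]
schema = [
    ("f1", ("tuple", [("f11", "chararray"), ("f12", "chararray")])),
    ("f2", "map"),
]
ddl = create_table("sample_table", schema)
print(ddl)
```

The sketch covers only the MAP variant of the Hive schema; the array<struct<...>> variant would need field names (f21, f22) that the Pig schema in the issue does not supply.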