Hong Tang commented on PIG-833:
Jeff, just like the SQL effort, the space of columnar storage is also wide
open, and I think it is more beneficial to the overall health of the Hadoop
That said, I also looked at the patch attached to HIVE-352. It appears that
what the patch does is a level below our stated objectives. Specifically, the
guts of the implementation (RCFile) are very close in spirit to TFile as
described in HADOOP-3315, which seems to have had its first comprehensive
patch back in December 2008.
> Storage access layer
> Key: PIG-833
> URL: https://issues.apache.org/jira/browse/PIG-833
> Project: Pig
> Issue Type: New Feature
> Reporter: Jay Tang
> A layer is needed to provide a high level data access abstraction and a
> tabular view of data in Hadoop, and could free Pig users from implementing
> their own data storage/retrieval code. This layer should also include a
> columnar storage format in order to provide fast data projection,
> CPU/space-efficient data serialization, and a schema language to manage
> physical storage metadata. Eventually it could also support predicate
> pushdown for further performance improvement. Initially, this layer could be
> a contrib project in Pig and become a hadoop subproject later on.
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.