[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745770#action_12745770 ]
He Yongqiang commented on PIG-833:
----------------------------------

Thanks, Jing. Yes, I know the design of column groups and projection. I asked because I saw this usage at line 251 of TestBasicTable.java:

{noformat}
doReadWrite(path, 2, 100, "SF_a,SF_b,SF_c,SF_d,SF_e", "[SF_a,SF_b,SF_c];[SF_d,SF_e]", "SF_f,SF_a,SF_c,SF_d", true, false);
{noformat}

Here "SF_f,SF_a,SF_c,SF_d" is passed as the projection, but is a column "SF_f" defined anywhere?

By the way, can you give more detail about the design of Partition? ColumnGroup is much like a projection in C-Store, so it is easier to understand.

> Storage access layer
> --------------------
>
>                 Key: PIG-833
>                 URL: https://issues.apache.org/jira/browse/PIG-833
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Jay Tang
>         Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, PIG-833-zebra.patch.bz2, PIG-833-zebra.patch.bz2, TEST-org.apache.hadoop.zebra.pig.TestCheckin1.txt, test.out, zebra-javadoc.tgz
>
>
> A layer is needed to provide a high level data access abstraction and a tabular view of data in Hadoop, and could free Pig users from implementing their own data storage/retrieval code. This layer should also include a columnar storage format in order to provide fast data projection, CPU/space-efficient data serialization, and a schema language to manage physical storage metadata. Eventually it could also support predicate pushdown for further performance improvement. Initially, this layer could be a contrib project in Pig and become a hadoop subproject later on.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.