[ https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761766#action_12761766 ]
Raghu Angadi commented on PIG-993:
----------------------------------

The API is pretty simple:

{code}
class org.apache.hadoop.zebra.BasicTable {
  /** see the patch for JavaDoc and attached example for usage */
  public static void dropColumnGroup(Path path, Configuration conf, String cgName)
      throws IOException {
    ...
  }
}
{code}

* The table schema is not modified.
* This API takes the name of a column group. PIG-986 adds explicit names for CGs.
* Once a CG is deleted, NULL is returned for the fields that were stored in that CG.
** This is the main difference between manually deleting a directory on the filesystem and 'properly' deleting a CG.
** Many of the changes made in other parts of Zebra are related to handling the missing CGs.

> [zebra] Ability to drop a column group in a table
> -------------------------------------------------
>
>                 Key: PIG-993
>                 URL: https://issues.apache.org/jira/browse/PIG-993
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>             Fix For: 0.5.0
>
>
> A Zebra table is stored as multiple sub-tables, each containing a set of
> columns called a column group (CG). The user specifies how these columns are
> grouped while creating a table through the _storage hint_.
> For some of the large tables, it might be necessary for users to remove a set
> of columns and retain the rest. This jira provides a way for users to delete
> an entire column group.
> The following comments will have more details on the API and the semantics.
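
To make the semantics in the comment concrete (the schema stays as it was, and fields of a dropped CG read back as NULL), here is a rough toy model in plain Java. It is only a sketch and not the Zebra code from the patch: the class {{ToyTable}}, its methods, and the single-row in-memory storage are all made up for illustration, and it does not touch the filesystem.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a table with column groups (not Zebra). */
class ToyTable {
    // Table schema: fixed once columns are added; dropping a CG never edits it.
    private final List<String> schema = new ArrayList<>();
    // CG name -> (column name -> value), modelling a single-row table.
    private final Map<String, Map<String, Object>> columnGroups = new LinkedHashMap<>();
    // Column name -> name of the CG that stores it.
    private final Map<String, String> columnToGroup = new HashMap<>();

    /** Registers a named column group and its columns. */
    void addColumnGroup(String cgName, Map<String, Object> columns) {
        columnGroups.put(cgName, new HashMap<>(columns));
        for (String column : columns.keySet()) {
            schema.add(column);
            columnToGroup.put(column, cgName);
        }
    }

    /** Drops the stored data of a CG by name; the schema is left untouched. */
    void dropColumnGroup(String cgName) {
        // Rejecting unknown names is a choice made for this toy only.
        if (columnGroups.remove(cgName) == null) {
            throw new IllegalArgumentException("No such column group: " + cgName);
        }
    }

    /** Returns a copy of the schema. */
    List<String> getSchema() {
        return new ArrayList<>(schema);
    }

    /** Reads a field; yields null when the owning CG has been dropped. */
    Object read(String column) {
        String cgName = columnToGroup.get(column);
        if (cgName == null) {
            throw new IllegalArgumentException("Unknown column: " + column);
        }
        Map<String, Object> data = columnGroups.get(cgName);
        return data == null ? null : data.get(column);
    }
}
```

With two groups, say {{CG1}} holding column {{a}} and {{CG2}} holding column {{b}}, calling {{dropColumnGroup("CG2")}} leaves {{b}} in the schema but makes {{read("b")}} return null, while {{read("a")}} is unaffected. Removing the data by hand, without the reader knowing the CG is gone, is the case the toy does not model.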