[jira] Commented: (PIG-1375) [Zebra] To support writing multiple Zebra tables through Pig
[ https://issues.apache.org/jira/browse/PIG-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858754#action_12858754 ]

Hadoop QA commented on PIG-1375:

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12442232/PIG-1375.patch
  against trunk revision 935101.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 9 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/295/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/295/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/295/console

This message is automatically generated.

[Zebra] To support writing multiple Zebra tables through Pig

    Key: PIG-1375
    URL: https://issues.apache.org/jira/browse/PIG-1375
    Project: Pig
    Issue Type: New Feature
    Affects Versions: 0.7.0
    Reporter: Chao Wang
    Assignee: Chao Wang
    Fix For: 0.8.0
    Attachments: PIG-1375.patch, PIG-1375.patch, PIG-1375.patch

In Zebra, we already have multiple outputs support for map/reduce. But we do not support this feature if users use Zebra through Pig. This jira is to address this issue. We plan to support writing to multiple output tables through Pig as well.

We propose to support the following Pig store statements with multiple outputs:

    store relation into 'loc1,loc2,loc3' using org.apache.hadoop.zebra.pig.TableStorer('storagehint_string', 'complete name of your custom partition class', 'some arguments to partition class'); /* if partition class arguments are needed */

    store relation into 'loc1,loc2,loc3' using org.apache.hadoop.zebra.pig.TableStorer('storagehint_string', 'complete name of your custom partition class'); /* if no partition class arguments are needed */

Note that users specify up to three arguments: the storage hint string, the complete name of the partition class, and, when needed, the partition class arguments string.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
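
To make the proposed store statement more concrete, here is a minimal Java sketch of the idea behind it. It assumes that the comma-separated location string is split into one path per output table and that the partition class returns an index into that list for each row. The names RowPartitioner, HashColumnPartitioner, and MultiOutputSketch are hypothetical stand-ins for illustration only; they are not Zebra's partition class API, and the patch may work differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a user-supplied partition class; NOT the Zebra API. */
interface RowPartitioner {
    /** Returns the index of the output table that should receive this row. */
    int partitionFor(Object[] row);
}

/** Routes rows by the hash of one column; the column index comes from the argument string. */
final class HashColumnPartitioner implements RowPartitioner {
    private final int column;
    private final int numOutputs;

    HashColumnPartitioner(String arguments, int numOutputs) {
        // Assumption: the "partition class arguments" string holds a column index.
        this.column = Integer.parseInt(arguments.trim());
        this.numOutputs = numOutputs;
    }

    @Override
    public int partitionFor(Object[] row) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return Math.floorMod(row[column].hashCode(), numOutputs);
    }
}

final class MultiOutputSketch {
    /** Splits a location string such as "loc1,loc2,loc3" into its output paths. */
    static List<String> parseLocations(String location) {
        List<String> paths = new ArrayList<>();
        for (String part : location.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                paths.add(trimmed);
            }
        }
        return paths;
    }

    /** Picks the output path for one row using the given partitioner. */
    static String route(List<String> paths, RowPartitioner partitioner, Object[] row) {
        return paths.get(partitioner.partitionFor(row));
    }
}
```

In this sketch the number of outputs is taken from the parsed location list, so the partitioner can never return an index outside the list.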
[jira] Commented: (PIG-1375) [Zebra] To support writing multiple Zebra tables through Pig
[ https://issues.apache.org/jira/browse/PIG-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858058#action_12858058 ]

Hadoop QA commented on PIG-1375:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12441981/PIG-1375.patch
  against trunk revision 935046.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 9 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/294/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/294/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/294/console

This message is automatically generated.

--
This message is automatically generated by JIRA.
- If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
- For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (PIG-1375) [Zebra] To support writing multiple Zebra tables through Pig
[ https://issues.apache.org/jira/browse/PIG-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858065#action_12858065 ]

Xuefu Zhang commented on PIG-1375:

Overall, the patch looks okay. However, I suggest the following changes:

1. Don't make unnecessary formatting changes (such as indentation) unless you're changing that part of the code. I think this is in the Apache coding guidelines.

2. The following if-else can be made clearer.

+    // If this is a sorted table and key is null (Pig's call path);
+    if (sortColIndices != null && key == null) {
+      for (int i = 0; i < sortColIndices.length; ++i) {
+        t.set(i, value.get(sortColIndices[i]));
+      }
+      key = builder.generateKey(t);
+    } else if (key == null) { // for unsorted table;
+      key = KEY0;
+    }

It can be:

    if (key == null) {
      if (sortColIndices != null) {
        ...
      } else {
        ...
      }
    }

They are equivalent, but the latter is a little easier to understand.

3. However, the above row-level if-else check should be avoided if possible. I think conf should contain a flag to indicate whether key generation is required (so the flag is set only in Pig's path).

4. The following object creation is unnecessary. The array should be used directly.

    List<Path> paths = new ArrayList<Path>(outputs.length);
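
Review points 2 and 3 in Xuefu Zhang's comment can be illustrated with a small, self-contained Java sketch. It is not the patch's code: the class KeySelectionSketch, the String keys, the buildKey helper, and the generateKey flag are all hypothetical stand-ins for the writer's real types (tuple, key builder, configuration). resolveKey shows the nested form suggested in point 2; resolveKeyWithFlag shows one possible reading of point 3, in which a flag decided once up front replaces the per-row null test.

```java
/** Stand-in for the writer's per-row key logic; names are illustrative, not Zebra's. */
final class KeySelectionSketch {
    /** Placeholder for the constant key used for unsorted tables. */
    static final String KEY0 = "";

    private final int[] sortColIndices;  // null for an unsorted table
    private final boolean generateKey;   // assumed to be read once from configuration

    KeySelectionSketch(int[] sortColIndices, boolean generateKey) {
        this.sortColIndices = sortColIndices;
        this.generateKey = generateKey;
    }

    /** Point 2: test key == null once, then branch on whether the table is sorted. */
    String resolveKey(String key, Object[] value) {
        if (key == null) {
            if (sortColIndices != null) {
                key = buildKey(value);   // sorted table: derive the key from the sort columns
            } else {
                key = KEY0;              // unsorted table: constant key
            }
        }
        return key;
    }

    /** Point 3 (one reading): the flag, set only on Pig's path, decides whether to generate. */
    String resolveKeyWithFlag(String key, Object[] value) {
        if (!generateKey) {
            return key;                  // caller already supplied the key
        }
        return sortColIndices != null ? buildKey(value) : KEY0;
    }

    /** Hypothetical key builder: joins the sort-column values with '|'. */
    private String buildKey(Object[] value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < sortColIndices.length; ++i) {
            if (i > 0) {
                sb.append('|');
            }
            sb.append(value[sortColIndices[i]]);
        }
        return sb.toString();
    }
}
```

Both methods return the same key whenever the flag is set exactly when the incoming key is null, which is the situation the review comment describes for Pig's call path.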