[jira] Commented: (PIG-1375) [Zebra] To support writing multiple Zebra tables through Pig

2010-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858754#action_12858754
 ] 

Hadoop QA commented on PIG-1375:


+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12442232/PIG-1375.patch
  against trunk revision 935101.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/295/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/295/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/295/console

This message is automatically generated.

 [Zebra] To support writing multiple Zebra tables through Pig
 

 Key: PIG-1375
 URL: https://issues.apache.org/jira/browse/PIG-1375
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.7.0
Reporter: Chao Wang
Assignee: Chao Wang
 Fix For: 0.8.0

 Attachments: PIG-1375.patch, PIG-1375.patch, PIG-1375.patch


 In Zebra, we already support multiple outputs for map/reduce, but not when 
 users use Zebra through Pig.
 This jira is to address this issue. We plan to support writing to multiple 
 output tables through Pig as well.
 We propose to support the following Pig store statements with multiple 
 outputs:

 /* if partition class arguments are needed */
 store relation into 'loc1,loc2,loc3' using
   org.apache.hadoop.zebra.pig.TableStorer('storagehint_string',
   'complete name of your custom partition class',
   'some arguments to partition class');

 /* if no partition class arguments are needed */
 store relation into 'loc1,loc2,loc3' using
   org.apache.hadoop.zebra.pig.TableStorer('storagehint_string',
   'complete name of your custom partition class');

 Note that users can specify up to three arguments: the storage hint string, 
 the complete name of the partition class, and the partition class arguments 
 string.
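
To illustrate the idea of routing rows to one of several comma-separated locations, here is a minimal stand-alone Java sketch. It does not use Zebra's partition class API: the `OutputPartitioner` interface, the `FirstColumnHashPartitioner` class, and the `locationFor` helper are all hypothetical names invented for this example, and a row is modeled as a plain `List<Object>`.

```java
import java.util.Arrays;
import java.util.List;

// Stand-alone illustration only; none of these types belong to Zebra or Pig.
class MultiOutputSketch {
    // Hypothetical stand-in for a custom partition class:
    // maps a row to the index of one of the output locations.
    interface OutputPartitioner {
        int partitionFor(List<Object> row, int numOutputs);
    }

    // Example partitioner (an assumption): hash the first column into [0, numOutputs).
    static class FirstColumnHashPartitioner implements OutputPartitioner {
        @Override
        public int partitionFor(List<Object> row, int numOutputs) {
            return Math.floorMod(row.get(0).hashCode(), numOutputs);
        }
    }

    // Splits a 'loc1,loc2,loc3' style string and picks the location for one row.
    static String locationFor(String locations, OutputPartitioner partitioner, List<Object> row) {
        String[] outputs = locations.split(",");
        return outputs[partitioner.partitionFor(row, outputs.length)];
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.<Object>asList(7, "x");
        System.out.println(locationFor("loc1,loc2,loc3", new FirstColumnHashPartitioner(), row));
    }
}
```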

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1375) [Zebra] To support writing multiple Zebra tables through Pig

2010-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858058#action_12858058
 ] 

Hadoop QA commented on PIG-1375:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12441981/PIG-1375.patch
  against trunk revision 935046.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/294/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/294/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/294/console





[jira] Commented: (PIG-1375) [Zebra] To support writing multiple Zebra tables through Pig

2010-04-16 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858065#action_12858065
 ] 

Xuefu Zhang commented on PIG-1375:
--

Overall, the patch looks okay. However, I suggest the following changes:

1. Don't make unnecessary formatting changes (such as indentation) unless 
you're changing that part of the code. I think this is in the Apache code 
guidelines.

2. The following if-else can be made clearer.

+    // If this is a sorted table and key is null (Pig's call path);
+    if (sortColIndices != null && key == null) {
+      for (int i = 0; i < sortColIndices.length; ++i) {
+        t.set(i, value.get(sortColIndices[i]));
+      }
+      key = builder.generateKey(t);
+    } else if (key == null) { // for unsorted table;
+      key = KEY0;
+    }

It can be:

if (key == null) {
  if (sortColIndices != null) {
    ...
  } else {
    ...
  }
}

They are equivalent, but the latter is a little easier to understand.
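
For reference, the nested form can be written out as a stand-alone Java sketch. This is not the patch code: the tuple is modeled as an `Object[]`, the row as a `List<Object>`, keys as `String`, and `generateKey` is a hypothetical substitute for `builder.generateKey(t)` that simply joins the sort-column values.

```java
import java.util.Arrays;
import java.util.List;

// Stand-alone sketch of the nested null check; not the Zebra patch itself.
class KeySelectionSketch {
    // Stand-in for the patch's KEY0 constant used for unsorted tables.
    static final String KEY0 = "";

    // Hypothetical substitute for builder.generateKey(t): joins the sort-column values.
    static String generateKey(List<Object> sortValues) {
        StringBuilder sb = new StringBuilder();
        for (Object v : sortValues) {
            sb.append(v).append('|');
        }
        return sb.toString();
    }

    static String resolveKey(String key, int[] sortColIndices, List<Object> value) {
        if (key == null) {                    // no key was supplied (Pig's call path)
            if (sortColIndices != null) {     // sorted table: derive the key from the sort columns
                Object[] t = new Object[sortColIndices.length];
                for (int i = 0; i < sortColIndices.length; ++i) {
                    t[i] = value.get(sortColIndices[i]);
                }
                key = generateKey(Arrays.asList(t));
            } else {                          // unsorted table: use the constant key
                key = KEY0;
            }
        }
        return key;                           // a supplied key is returned unchanged
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.<Object>asList("a", 7, "c");
        System.out.println(resolveKey(null, new int[] {2, 0}, row));
        System.out.println(resolveKey("given", new int[] {2, 0}, row));
    }
}
```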

3. However, the above row-level if-else check should be avoided if possible. I 
think conf should contain a flag to indicate if key generation is required (so 
the flag is set only in pig's path).
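
One way to picture this suggestion is the sketch below, which decides the key policy once from a configuration flag instead of inspecting each row. It rests on several assumptions: `java.util.Properties` stands in for the Hadoop configuration object, the property name `sketch.zebra.generate.key` is invented for this example, and the derived key is passed in eagerly for brevity.

```java
import java.util.Properties;
import java.util.function.BinaryOperator;

// Stand-alone sketch; Properties stands in for the job configuration.
class KeyFlagSketch {
    // Hypothetical property name; the real patch may use a different one or none at all.
    static final String GENERATE_KEY_FLAG = "sketch.zebra.generate.key";

    // Chooses the policy once at setup. The returned function maps
    // (suppliedKey, derivedKey) to the key that should be written.
    static BinaryOperator<String> chooseKeyPolicy(Properties conf) {
        boolean generate = Boolean.parseBoolean(conf.getProperty(GENERATE_KEY_FLAG, "false"));
        return generate
                ? (supplied, derived) -> derived    // flag set (Pig's path): use the generated key
                : (supplied, derived) -> supplied;  // flag unset: keep the caller's key
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(GENERATE_KEY_FLAG, "true");  // only the Pig path would set this
        BinaryOperator<String> policy = chooseKeyPolicy(conf);
        System.out.println(policy.apply(null, "derived-key"));
    }
}
```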

4. The following object creation is unnecessary. Array should be used directly.

List<Path> paths = new ArrayList<Path>(outputs.length);
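
A minimal sketch of the array-based alternative follows. It assumes `outputs` is a `String[]` of locations, which the quoted line suggests but does not confirm, and it uses `java.nio.file.Path` as a stand-in for Hadoop's `Path` so that the example compiles on its own; `toPaths` is a hypothetical helper name.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Stand-alone sketch; java.nio.file.Path stands in for Hadoop's Path class.
class OutputPathsSketch {
    // Fills a fixed-size array directly, with no intermediate list.
    static Path[] toPaths(String[] outputs) {
        Path[] paths = new Path[outputs.length];
        for (int i = 0; i < outputs.length; ++i) {
            paths[i] = Paths.get(outputs[i]);
        }
        return paths;
    }

    public static void main(String[] args) {
        Path[] paths = toPaths(new String[] {"loc1", "loc2", "loc3"});
        System.out.println(paths.length);
    }
}
```

Because the number of outputs is known up front, the array needs no resizing and no later conversion.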


