[ https://issues.apache.org/jira/browse/PIG-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858065#action_12858065 ]
Xuefu Zhang commented on PIG-1375:
----------------------------------
Overall, the patch looks okay. However, I suggest the following changes:
1. Don't make unnecessary formatting changes (such as indentation) unless you're
changing that part of the code. I think this is in the Apache coding guidelines.
2. The following if-else can be made clearer.
+ // If this is a sorted table and key is null (Pig's call path);
+ if (sortColIndices != null && key == null) {
+     for (int i =0; i < sortColIndices.length;++i) {
+         t.set(i, value.get(sortColIndices[i]));
+     }
+     key = builder.generateKey(t);
+ } else if (key == null) { // for unsorted table;
+     key = KEY0;
+ }
It can be:
if( key == null ) {
    if( sortColIndices != null ) {
        ...
    } else {
        ...
    }
}
They are equivalent, but the latter is a little easier to understand.
3. However, the above row-level if-else check should be avoided if possible. I
think conf should contain a flag to indicate whether key generation is required
(so the flag is set only in Pig's path).
4. The following object creation is unnecessary. The array should be used directly.
List<Path> paths = new ArrayList<Path>(outputs.length);
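
To illustrate points 2 and 3, here is a minimal, self-contained sketch. It is
not the Zebra code: the class names, the flag name "sketch.generate.key", the
String key type, the Map used in place of conf, and the generateKey stand-in
are all assumptions made for illustration. resolveKey shows the nested form
from point 2; Writer shows one possible way to read a flag once so the per-row
path only tests a boolean.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class KeySelectionSketch {
    // Stand-in for the KEY0 constant in the patch (value is an assumption).
    static final String KEY0 = "";
    // Hypothetical flag name; the real configuration key is not given in the review.
    static final String GENERATE_KEY_FLAG = "sketch.generate.key";

    // Hypothetical stand-in for builder.generateKey(t).
    static String generateKey(List<Object> sortValues) {
        return sortValues.toString();
    }

    // Point 2: test key == null once, then branch on sorted vs. unsorted.
    static String resolveKey(String key, int[] sortColIndices, List<Object> value) {
        if (key == null) {
            if (sortColIndices != null) {
                // Sorted table: build the key from the sort columns.
                List<Object> t = new ArrayList<>(sortColIndices.length);
                for (int i = 0; i < sortColIndices.length; ++i) {
                    t.add(value.get(sortColIndices[i]));
                }
                key = generateKey(t);
            } else {
                // Unsorted table: use the fixed placeholder key.
                key = KEY0;
            }
        }
        return key;
    }

    // Point 3: read the flag once at setup instead of inferring it per row.
    static final class Writer {
        private final boolean generateKey;
        private final int[] sortColIndices;

        Writer(Map<String, String> conf, int[] sortColIndices) {
            // A missing flag parses as false, so only the caller that sets it
            // (Pig's path, in the review's terms) gets key generation.
            this.generateKey = Boolean.parseBoolean(conf.get(GENERATE_KEY_FLAG));
            this.sortColIndices = sortColIndices;
        }

        String keyFor(String key, List<Object> value) {
            if (!generateKey) {
                return key; // caller supplied the key; pass it through
            }
            // Assumption: when the flag is set, the caller passes no key.
            return resolveKey(null, sortColIndices, value);
        }
    }
}
```

With the flag approach, the decision about who generates the key is made once
at setup, and the row-level code no longer has to infer the call path from a
null key.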
> [Zebra] To support writing multiple Zebra tables through Pig
> ------------------------------------------------------------
>
> Key: PIG-1375
> URL: https://issues.apache.org/jira/browse/PIG-1375
> Project: Pig
> Issue Type: New Feature
> Affects Versions: 0.7.0
> Reporter: Chao Wang
> Assignee: Chao Wang
> Fix For: 0.8.0
>
> Attachments: PIG-1375.patch, PIG-1375.patch
>
>
> In Zebra, we already have multiple outputs support for map/reduce. But we do
> not support this feature if users use Zebra through Pig.
> This jira is to address this issue. We plan to support writing to multiple
> output tables through Pig as well.
> We propose to support the following Pig store statements with multiple
> outputs:
> store relation into 'loc1,loc2,loc3....' using
> org.apache.hadoop.zebra.pig.TableStorer('storagehint_string',
> 'complete name of your custom partition class', 'some arguments to partition
> class'); /* if certain partition class arguments are needed */
> store relation into 'loc1,loc2,loc3....' using
> org.apache.hadoop.zebra.pig.TableStorer('storagehint_string',
> 'complete name of your custom partition class'); /* if no partition class
> arguments are needed */
> Note that users need to specify up to three arguments - storage hint string,
> complete name of partition class and partition class arguments string.