[ 
https://issues.apache.org/jira/browse/PIG-158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581744#action_12581744
 ] 

Santhosh Srinivasan commented on PIG-158:
-----------------------------------------

I have addressed most of the comments. I have questions regarding some of the 
comments. Replies are inline with [Santhosh]. Thanks for the comments. 

In LOSort.getSchema, LOSort should always have one and only one parent, so all 
the checking around s.iterator().getNext() isn't necessary. Just do 
s.iterator().next(), and then assert if you get back a null. Same goes for 
LODistinct.

[Santhosh] assert will work only if the -ea switch is used at run time. I was 
inclined towards the check in code instead of relying on the runtime switch.

+++

LOMapLookup: This should be an ExpressionOperator, rather than a direct 
extender of LogicalOperator. Also, it should only take one value, not an array. 
For example, there is a query like:

a = load 'myfile' as mymap map;
b = foreach a generate mymap.myfirstkey, mymap.mysecondkey;

then the resulting logical plan should have a LOGenerate operator with two 
expressions, a LOMapLookup that has map of mymap and a key of myfirstkey, and a 
LOMapLookup that has map of mymap, and key of mysecondkey.

[Santhosh] I have changed LOMapLookup to have the key instead of the list of 
keys. Should the map be an attribute of LOMapLookup or is it provided by the 
LogicalPlan?

+++

LOCogroup: I think the group by cols array can be an array of 
ExpressionOperators. You could envision grouping on a transformation of the 
columns, but not on a relational operator.

I don't think the LOCogroup.getSchema method is correct. The schema that 
results from cogroup will be group, bag1, bag2, ... Group may or may not be a 
tuple, depending on how many group by keys there are. The other columns are 
bags with tuples from each of the relations being grouped. So if you have

a = group b by name;

then the resulting schema (assuming name is of type bytearray) is: (bytearray, 
bag).

If you have

a = cogroup b by name, c by name;

then the resulting schema is (tuple, bag, bag).

[Santhosh] I still don't see why the current code does not handle your example 
(the part of re-cociling the schema for the group is a TODO)

+++

Why did LOUnion get totally commented out? We still need an LOUnion. Same goes 
in LOVisitor.visitUnion().

[Santhosh] I probably missed the svn add. Its added now.

+++

LOGenerates mProjections array should be an array of ExpressionOperators.

[Santhosh] I changed the array to ExpressionOperators. I also have a LOProject 
for dereferencing tuples. The new sources needs a review.

> Rework logical plan
> -------------------
>
>                 Key: PIG-158
>                 URL: https://issues.apache.org/jira/browse/PIG-158
>             Project: Pig
>          Issue Type: Sub-task
>          Components: impl
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: logical_operators.patch
>
>
> Rework the logical plan in line with 
> http://wiki.apache.org/pig/PigExecutionModel

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to