[ 
https://issues.apache.org/jira/browse/PIG-158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593345#action_12593345
 ] 

Pi Song commented on PIG-158:
-----------------------------

This patch is a monster!!!
Questions and comments for the first 30-40% of the parser:-

1) In parseCogroup,
{code}
+ int arity = gis.get(0).specList.size();
+
  for (int i = 0; i < n ; i++){
 
  CogroupInput gi = gis.get(i);
  los.add(gi.op);
- specs.add(gi.spec);
+ ArrayList<ExpressionOperator> groupByOps = gi.specList;
+ int numGrpByOps = groupByOps.size();
+ log.info("Number of group by operators = " + numGrpByOps);
+
+ if(arity != numGrpByOps) {
+ throw new ParseException("The arity of the group by columns do not match.");
+ }
{code}
so this seems all the cogroup inputs must have the same number of grouped 
columns, right? I think this somewhat lowers the flexibility of the language. 
Not sure what Alan really wants. Another question is if we group by more than 
one column, will the "group" column of the output tuples be tuples?

2) In parseCogroup, I don't quite understand this bit. Initially specs is 
declared as:-
{code}
ArrayList<ArrayList<ExpressionOperator>> specs = new 
ArrayList<ArrayList<ExpressionOperator>>();
{code}
but this is how it is being used:-
{code}
for (LogicalOperator op: specs) {
+ lp.connect(op, cogroup);
{code}

3) In rewriteJoin, the usage of "column" is a bit weird, isn't it?
{code}
  for (int i = 0; i < n; i++) {
- EvalSpec column = new ProjectSpec(i+1);
+ ExpressionOperator column = new LOProject(lp, new OperatorKey(scope, 
getNextId()), gis.get(i).op, -1);
+ ((LOProject)column).setStar(true);
{code}

4) In rewriteJoin, after the creation of cogroup in this line. I don't see it 
is being connected to anything.
LogicalOperator cogroup = parseCogroup(gis, lp);

5) Here why return type is NULL and why do we use LOUserFunc?
{code}
<ANY> {gs = new LOGenerate(lp, new OperatorKey(scope, getNextId()), new 
LOUserFunc(lp, new OperatorKey(scope, getNextId()), GFAny.class.getName(), 
null, DataType.NULL));}
{code}

> Rework logical plan
> -------------------
>
>                 Key: PIG-158
>                 URL: https://issues.apache.org/jira/browse/PIG-158
>             Project: Pig
>          Issue Type: Sub-task
>          Components: impl
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: logical_operators.patch, logical_operators_rev_1.patch, 
> logical_operators_rev_2.patch, logical_operators_rev_3.patch, 
> parser_changes.patch, parser_changes_v1.patch, parser_changes_v2.patch, 
> ParserErrors.txt, visitorWalker.patch
>
>
> Rework the logical plan in line with 
> http://wiki.apache.org/pig/PigExecutionModel

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to