[
https://issues.apache.org/jira/browse/PIG-158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593345#action_12593345
]
Pi Song commented on PIG-158:
-----------------------------
This patch is a monster!!!
Questions and comments for the first 30-40% of the parser:-
1) In parseCogroup,
{code}
+ int arity = gis.get(0).specList.size();
+
for (int i = 0; i < n ; i++){
CogroupInput gi = gis.get(i);
los.add(gi.op);
- specs.add(gi.spec);
+ ArrayList<ExpressionOperator> groupByOps = gi.specList;
+ int numGrpByOps = groupByOps.size();
+ log.info("Number of group by operators = " + numGrpByOps);
+
+ if(arity != numGrpByOps) {
+ throw new ParseException("The arity of the group by columns do not match.");
+ }
{code}
so this seems all the cogroup inputs must have the same number of grouped
columns, right? I think this somewhat lowers the flexibility of the language.
Not sure what Alan really wants. Another question is if we group by more than
one column, will the "group" column of the output tuples be tuples?
2) In parseCogroup, I don't quite understand this bit. Initially specs is
declared as:-
{code}
ArrayList<ArrayList<ExpressionOperator>> specs = new
ArrayList<ArrayList<ExpressionOperator>>();
{code}
but this is how it is being used:-
{code}
for (LogicalOperator op: specs) {
+ lp.connect(op, cogroup);
{code}
3) In rewriteJoin, the usage of "column" is a bit weird, isn't it?
{code}
for (int i = 0; i < n; i++) {
- EvalSpec column = new ProjectSpec(i+1);
+ ExpressionOperator column = new LOProject(lp, new OperatorKey(scope,
getNextId()), gis.get(i).op, -1);
+ ((LOProject)column).setStar(true);
{code}
4) In rewriteJoin, after the creation of cogroup in this line. I don't see it
is being connected to anything.
LogicalOperator cogroup = parseCogroup(gis, lp);
5) Here why return type is NULL and why do we use LOUserFunc?
{code}
<ANY> {gs = new LOGenerate(lp, new OperatorKey(scope, getNextId()), new
LOUserFunc(lp, new OperatorKey(scope, getNextId()), GFAny.class.getName(),
null, DataType.NULL));}
{code}
> Rework logical plan
> -------------------
>
> Key: PIG-158
> URL: https://issues.apache.org/jira/browse/PIG-158
> Project: Pig
> Issue Type: Sub-task
> Components: impl
> Reporter: Alan Gates
> Assignee: Alan Gates
> Attachments: logical_operators.patch, logical_operators_rev_1.patch,
> logical_operators_rev_2.patch, logical_operators_rev_3.patch,
> parser_changes.patch, parser_changes_v1.patch, parser_changes_v2.patch,
> ParserErrors.txt, visitorWalker.patch
>
>
> Rework the logical plan in line with
> http://wiki.apache.org/pig/PigExecutionModel
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.