[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-09-07 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12907061#action_12907061
 ] 

Daniel Dai commented on PIG-1178:
-

PIG-1178-11.patch committed to both trunk and 0.8 branch. 

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG-1178-10.patch, PIG-1178-11.patch, PIG-1178-4.patch, 
 PIG-1178-5.patch, PIG-1178-6.patch, PIG-1178-7.patch, PIG-1178-8.patch, 
 PIG-1178-9.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, 
 pig_1178_2.patch, pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.4.patch, 
 pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-09-06 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12906592#action_12906592
 ] 

Daniel Dai commented on PIG-1178:
-

Patch PIG-1178-10.patch committed.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG-1178-10.patch, PIG-1178-4.patch, PIG-1178-5.patch, 
 PIG-1178-6.patch, PIG-1178-7.patch, PIG-1178-8.patch, PIG-1178-9.patch, 
 pig_1178.patch, pig_1178.patch, PIG_1178.patch, pig_1178_2.patch, 
 pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.4.patch, pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-08-28 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12903803#action_12903803
 ] 

Daniel Dai commented on PIG-1178:
-

test-patch result for PIG-11780-8:

 [exec] +1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.

Patch committed.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG-1178-4.patch, PIG-1178-5.patch, PIG-1178-6.patch, 
 PIG-1178-7.patch, PIG-1178-8.patch, pig_1178.patch, pig_1178.patch, 
 PIG_1178.patch, pig_1178_2.patch, pig_1178_3.2.patch, pig_1178_3.3.patch, 
 pig_1178_3.4.patch, pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-08-23 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12901543#action_12901543
 ] 

Daniel Dai commented on PIG-1178:
-

PIG-1178-7.patch committed.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG-1178-4.patch, PIG-1178-5.patch, PIG-1178-6.patch, 
 PIG-1178-7.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, 
 pig_1178_2.patch, pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.4.patch, 
 pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-08-05 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12895771#action_12895771
 ] 

Daniel Dai commented on PIG-1178:
-

PIG-1178-6.patch committed.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG-1178-4.patch, PIG-1178-5.patch, PIG-1178-6.patch, 
 pig_1178.patch, pig_1178.patch, PIG_1178.patch, pig_1178_2.patch, 
 pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.4.patch, pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-08-04 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12895165#action_12895165
 ] 

Daniel Dai commented on PIG-1178:
-

Did some restructure and bug fixing. Also move package from experimental to 
newplan.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG-1178-4.patch, PIG-1178-5.patch, pig_1178.patch, pig_1178.patch, 
 PIG_1178.patch, pig_1178_2.patch, pig_1178_3.2.patch, pig_1178_3.3.patch, 
 pig_1178_3.4.patch, pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-08-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12895463#action_12895463
 ] 

Hadoop QA commented on PIG-1178:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12451203/PIG-1178-5.patch
  against trunk revision 982423.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 91 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/375/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG-1178-4.patch, PIG-1178-5.patch, pig_1178.patch, pig_1178.patch, 
 PIG_1178.patch, pig_1178_2.patch, pig_1178_3.2.patch, pig_1178_3.3.patch, 
 pig_1178_3.4.patch, pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-07-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12892658#action_12892658
 ] 

Hadoop QA commented on PIG-1178:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12450250/PIG-1178-4.patch
  against trunk revision 979362.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 48 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 446 release audit warnings 
(more than the trunk's current 398 warnings).

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/355/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/355/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/355/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/355/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG-1178-4.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, 
 pig_1178_2.patch, pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.4.patch, 
 pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-07-27 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12892960#action_12892960
 ] 

Alan Gates commented on PIG-1178:
-

12K lines of code, wow!

In ProjectExpression, what does the new attachedRelationalOp do?  javadoc 
comments on that in the constructor would be good.

Is the purpose of this patch to also make this the default optimizer or leave 
it in experimental mode?  If it is to make it the default, I think we should 
move it from the experimental package, though probably in a separate patch.  If 
it isn't, any thoughts on when it would be ready to become the default 
optimizer?



 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG-1178-4.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, 
 pig_1178_2.patch, pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.4.patch, 
 pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-07-22 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12891438#action_12891438
 ] 

Daniel Dai commented on PIG-1178:
-

Attach PIG-1178-4.patch, include change of the following area:
1. Add all the relational operators
2. Add foreach nested plans
3. Add field schema to expression operators
4. Remove UidStamp, instead, uid will be generated and cached first time we get 
fieldschema
5. Fix column pruner and all other new logical plan test cases
6. Add TypeCastInserter

Still polishing and refactory the code.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG-1178-4.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, 
 pig_1178_2.patch, pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.4.patch, 
 pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12855486#action_12855486
 ] 

Hadoop QA commented on PIG-1178:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12441315/pig_1178_3.4.patch
  against trunk revision 932472.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to cause Findbugs to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/281/testReport/
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/281/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, pig_1178_2.patch, 
 pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.4.patch, pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12845083#action_12845083
 ] 

Hadoop QA commented on PIG-1178:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438738/pig_1178_3.3.patch
  against trunk revision 922664.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 28 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/251/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/251/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/251/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, pig_1178_2.patch, 
 pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-03-14 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12845173#action_12845173
 ] 

Daniel Dai commented on PIG-1178:
-

pig_1178_3.3.patch committed. Manual unit pass.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, pig_1178_2.patch, 
 pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-03-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12845023#action_12845023
 ] 

Hadoop QA commented on PIG-1178:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438710/pig_1178_3.2.patch
  against trunk revision 922664.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 28 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/250/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/250/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/250/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, pig_1178_2.patch, 
 pig_1178_3.2.patch, pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-02-24 Thread Ankit Modi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837954#action_12837954
 ] 

Ankit Modi commented on PIG-1178:
-

the core tests are failing due to some issue with hudson or the framework.

I ran the core tests again yesterday night and they passed. 

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, pig_1178_2.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-02-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837196#action_12837196
 ] 

Hadoop QA commented on PIG-1178:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12436606/pig_1178_2.patch
  against trunk revision 915079.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 25 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/216/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/216/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/216/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, pig_1178_2.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-02-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837062#action_12837062
 ] 

Hadoop QA commented on PIG-1178:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12436606/pig_1178_2.patch
  against trunk revision 912064.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 25 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/211/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/211/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/211/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, pig_1178_2.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-02-10 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12832072#action_12832072
 ] 

Alan Gates commented on PIG-1178:
-

Responses to Ashutosh's questions and comments:

bq. b) At various places Runtime Exceptions are thrown. Do we need a new 
Exception class something like ExperimentalOptimizerException so as to easily 
spot those exceptions? That will aid in debugging. Also error messages are 
pretty terse. More details can be put in messages, again to aid in debugging 
later on.

The error messages are sparse.  Part of moving this from prototype to 
production will be fleshing out those error messages, adding error codes, etc.  

As for adding a new exception, I'm not sure I see the value.  Hopefully 
RuntimeException is only used in appropriate places.  If it's being used where 
a different exception should be used, then we can change it.

bq. c) At couple of places, changes are made outside of experimental package. 
Will be useful to put comments there for why are those needed. 

Changes were made to FuncSpec and FileSpec to give them equals functions so 
that comparisons can be done between two nodes that both contain a FuncSpec.  
For example, we may want to determine if two load operators are the same.  Part 
of that will be determining if they use the same load function and load the 
same file.

bq. Was wondering about different optimizations that we do on a complied MR 
plan. ... [E]ssentially those optimizations are also done through visitors and 
would benefit greatly if there is a framework for them just as there is one for 
front-end. Is there any plan to also subsume those visitors (possibly by 
rewriting them as rule-transform pairs) in this new optimizer or will they be 
dealt with separately later on?

Plan is too strong a word; there is a hope.  I agree that the MR optimizer is a 
mess, and it needs a framework.  The question is whether the same framework 
will suffice for both.  I hope that it will.  But to avoid taking on too much 
we have defined the scope of this current work to just be the logical 
optimizer.  After we have the logical optimizer in good shape we will need to 
address issues in the MR optimizer.  Hopefully this framework being developed 
will work with minimal refactoring and changes.  If not, a different framework 
will need to be designed for the MR case.



 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-02-10 Thread Ying He (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12832164#action_12832164
 ] 

Ying He commented on PIG-1178:
--

for the annotation resetting, I think it can be implemented as a 
PlanTransformListener. The listener has access to the plan and can reset every 
node, given the order is not important.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-02-10 Thread Ying He (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12832313#action_12832313
 ] 

Ying He commented on PIG-1178:
--

Here is my thoughts to use this framework to implement PruneColumns.

1. Separate prune columns and prune map keys into 2 rules. Current 
implementation mixed them in one class. It's better to separate them to make 
each rule simpler. 

2. The prune column rule can be implemented by creating a new visitor. This 
visitor is called from transform(), and it visits every 
LogicalRelationalOperator by reverse dependency order. Each 
visit(LogicalRelationalOperator) calculates the required output uids  by 
combining the input uids from it successors. If a node is the sink of the plan, 
the output uids are retrieved from its schema. The input uids are calculated 
from its output uids by looking into the expression plan(s) of this operator.  
If an output uid is derived from other uids, the source uids should be put into 
input uids. For example, a+b is from a  b. The input uids should keep the uid 
of a  b.   Each operator should consider its logical meanings when calculating 
input uids from output uids. For example, for LOCross, the input uids should 
contain at least one field from each input. 

The input uids and output uids can be added into the operator as annotations.

3. After step 2, use another visitor to go over the plan again by dependency 
order to prune the columns.  This can be done by reading out the input and 
output uids for each node.

4. I think it's ok to implement prune column and prune map key as regular rule. 
They just need to overwrite the match().

public ListOperatorPlan match(OperatorPlan plan) {
ListOperatorPlan ll = new ArrayListOperatorPlan();
ll.add(plan);
return ll;
}

This method tells optimizer that only one match is find, which is the plan 
itself.

5. For Transformer class, I suggest to get rid of check() and change void 
transform() into  boolean transform().   If transform() returns false, it means 
no transformation is made. If it returns true, transformation is made. The 
reason is that for some rules, it is not easy to know if a change is going to 
be made, such as PruneColumn rule.   If we have both check() and transform(), 
lots of logic would be duplicated in these two methods.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12832328#action_12832328
 ] 

Hadoop QA commented on PIG-1178:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12435054/pig_1178.patch
  against trunk revision 908324.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 34 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 489 release audit warnings 
(more than the trunk's current 487 warnings).

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/199/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/199/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/199/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/199/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-02-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831886#action_12831886
 ] 

Ashutosh Chauhan commented on PIG-1178:
---

 Was wondering  about different optimizations that we do on a complied MR plan. 
Not sure if its already been discussed or is in some doc. But essentially those 
optimizations are also done through visitors and would benefit greatly if there 
is a framework for them just as there is one for front-end. Is there any plan 
to also subsume those visitors (possibly by rewriting them as rule-transform 
pairs) in this new optimizer or will they be dealt with separately later on?  

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-02-04 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12829759#action_12829759
 ] 

Alan Gates commented on PIG-1178:
-

Comments that came out of a review of the twiki doc the pig team did:

1) In OperatorPlan, the use of roots and leaves in the graph was considered 
confusing.  Some people view roots as sources and some as sinks.  It was 
recommended that we switch roots to sources and leaves to sinks to avoid 
confusion.

2) The new OperatorPlan does not include mergeSharedPlan, which was used by 
multi-query functionality in the old plan.  After further investigation I found 
that merge is currently only used by multi-query for physical plans.  While 
ideally we would like to use this infrastructure for physical plans too, I feel 
it is reasonable to put off adding merge until at least the initial prototyping 
phase is done.  After briefling looking at it I see no reason why it should not 
work, though we may need a more precise way to decide when two nodes are the 
same and should be merged.

3) A point was raised that perhaps the optimizer should reset the annotations 
on the nodes after a transform and all the attached listeners have been run.  
With further thought, I don't think so, as there may be annotations we want to 
last across transforms.  For example, a rule that could match an infinite 
number of times may want to sign a node to note it's already been there so 
that it does not fire on the node again.  The easiest way to do this signing 
would be with the annotations.  However, I can see that there would be a desire 
to clear certain annotations so that each pass of the optimizer has a fresh 
state.  To accomplish this I was wondering if we should allow developers to add 
visitors that would be run after all the listeners run.  So PlanOptimizer would 
change to have a new method:

{code}
addStatusResettingVisitor(Visitor v) {
resetters.add(v);
}
{code}

and in the optimize loop

{code}
for (OperatorPlan m : matches) {
if (transformer.check(m)) {
sawMatch = true;
transformer.transform(m);
for(PlanTransformListener l: listeners) {
l.transformed(plan, transformer.reportChanges());
}
}
}
{code}

would change to be:

{code}

for (OperatorPlan m : matches) {
if (transformer.check(m)) {
sawMatch = true;
transformer.transform(m);
for(PlanTransformListener l: listeners) {
l.transformed(plan, transformer.reportChanges());
}
for(Visitor v : resetters) {
v.visit();
}
}
}
{code}

Thoughts?

4) There is not clarity on how column pruning will work in the new optimizer.  
Will it be represented by a rule?  If so, how, since the new optimizer does not 
allow matching on any operator just on specific operators?  Would it be better 
instead to have it use the Transformers but not the PlanOptimizer 
infrastructure, since it isn't clear that we would want the column pruning rule 
to be triggered more than once?  To answer these I think we should prototype 
the column pruning soon.  It was one of the hardest parts of the existing 
infrastructure.  We want to make sure it can be done well in this new approach 
before committing to the approach.

5) The comment was made that while the examples in the document appear to show 
that the proposal will work for nested plans (that is, inner plans in foreach) 
they do not show that it will work for operators not yet nestable in foreach 
(e.g. group, foreach).  Since a stated goal of Pig Latin is to someday allow 
arbitrary nesting, we should validate that the proposal will support these 
additional operators to be nested in foreach.


 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, pig_1178.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier 

[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-22 Thread Ying He (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12803856#action_12803856
 ] 

Ying He commented on PIG-1178:
--

Alan,thanks for the review.

for 6), the predecessor of LOFilter would be LOJoin,  so all projections would 
have input number  0.  My algorithm is to get field names from column number. 
The field names after join would be like A::id, B::id,   And findCommon() is to 
search for the longest prefix of these fields, to push filter to be after that 
alias. For example, if field names are A::id, and A::value, the filter is 
pushed after A.  if field names are D::A::id and D::A::value, the filter can be 
pushed after D, then pushed further to be after A.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-21 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12803565#action_12803565
 ] 

Alan Gates commented on PIG-1178:
-

Comments on lp.patch:

1) In LOJoin.getSchema, these lines of code:

{code}
for (Operator op : inputs) {
LogicalSchema inputSchema = ((LogicalRelationalOperator)op).getSchema();
// the schema of one input is unknown, so the join schema is unknown, just 
return 
if (inputSchema == null) {
schemaSet = true;
return schema;
}
{code}

You are assuming that schema is null.  It would be better to explicitly set 
schema to null and then return it.

2) In SplitFilter.transform you put it in a while loop, finding each 'and' and 
splitting it into another filter.  But there's already an outer while loop (the 
one in the optimizer applying the rule over and over) that will do that.  One 
of the assertions in this design is that each rule should be as simple as 
possible.  This rule should just split one and, and let the next application of 
the rule find the next and and split it again.

Same comment applies to MergeFilter.transform and to 
PushUpFilterTransformer.check and .transform.

3) In MergeFilter.check:  IIRC implicit splits aren't inserted into the plan 
until the logical to physical transformation.  So it's possible that a filter 
actually has multiple successors.  So instead of:

{code}
if (succeds != null  succeds.size()0) {
if (succeds.get(0) instanceof LOFilter) {
return true;
}
}
{code}

it should read

{code}
if (succeds != null  succeds.size() == 1) {
if (succeds.get(0) instanceof LOFilter) {
return true;
}
}
{code}

4) In MergeFilter.combineFilterCond:  The expressions have been written in such 
a way that they manage their own connections when they are created.  See for 
example, AndExpression.  In its constructor it takes it add itself to the 
expression plan and connects itself to its two operands.  So there no need to 
to do the addPlan.add and addPlan.connect calls.

5) In PushUpFilterTransformer.check, you need to check that the join type is 
inner.  Pushing past outer joins is much trickier, and need not be handled here.

6) In PushUpFilterTransformer.check I don't understand what findCommon is 
doing.  In any case, it should not be paying attention to aliases.  It should 
be using the inputNums from the projection.  It should be checking that all 
projections in the filter are associated with the same inputNum.  If so, it is 
pushable to that inputNum.  If not, not.  In the same way transform should be 
using inputNum to find the right predecessor, not aliases.

7) We need a fourth rule to handle swapping filters, so each one can be tried 
against the join.  Since this rule will always pass check (it would just be two 
filters in a row) we need a way to check that it doesn't run more than twice 
for a given pair of filters.  We can accomplish this by having it 'sign' each 
filter in the node each time it is applied.  This is what the annotate call on 
Operator is for.  So each time the transform is applied, it would annotate both 
filters with info that it was applied, and to which filters.  Then part of 
check can be two check that this rule has been applied at most twice.


 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-14 Thread Ying He (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12800289#action_12800289
 ] 

Ying He commented on PIG-1178:
--

+1

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions.patch, lp.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-14 Thread Ying He (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12800353#action_12800353
 ] 

Ying He commented on PIG-1178:
--

to answer Daniel's questions:

. In Rule.match, is PatternMatchOperatorPlan only contains leave nodes but not 
edge information? If so, instead of saying A list of all matched sub-plans, 
can we put more details in the comments?

The returned lists are plans. You can call getPredecessors() or getSuccessors() 
on any node in the plan. The implementation doesn't keep edge information, it 
calls the base plan for this information and return the operators that are in 
this sub-plan. So looking from outside, it is a plan, it's just read-only, and 
method to update the plan would throw an exception.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12800434#action_12800434
 ] 

Hadoop QA commented on PIG-1178:


+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12430285/expressions-2.patch
  against trunk revision 898497.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/175/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/175/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/175/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-14 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12800479#action_12800479
 ] 

Alan Gates commented on PIG-1178:
-

I've checked in the expressions-2.patch.  I'll flesh out LogicalSchema in a 
separate patch.



 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-13 Thread Ying He (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799859#action_12799859
 ] 

Ying He commented on PIG-1178:
--

a couple questions on expression operators:

1. in ProjectExpression, is it better to change the object variable input 
from int to LogicalRelationalOperator to point to the operator that the 
project expression operates on directly?  And I don't understand why this 
operator needs alias  it references. But if we change input to operator object, 
the alias can be get from the operator.

2. I don't know the purpose of ColumnExpression. Is it to capture operands?  It 
doesn't seem to have any special features. So I am not sure if it is necessary


 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions.patch, lp.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-13 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12800054#action_12800054
 ] 

Alan Gates commented on PIG-1178:
-

bq. in ProjectExpression, is it better to change the object variable input 
from int to LogicalRelationalOperator to point to the operator that the 
project expression operates on directly?
I want to avoid references to the actual relational operators in the 
expressions because it makes patching up the plans after a transformation much 
easier.  If each project keeps a reference to the relational operator, then 
when the plan is transformed we have to go to every project and change its 
reference.  By keeping pointers only to which input number, we don't have to 
make any changes in the projects after a transformation in the plan.

bq. And I don't understand why this operator needs alias it references. But if 
we change input to operator object, the alias can be get from the operator.
You're right, we shouldn't double store aliases here.  We should just use the 
uid and the project reference.  I'll make the change.

bq. I don't know the purpose of ColumnExpression. Is it to capture operands? It 
doesn't seem to have any special features. So I am not sure if it is necessary
It's a super class for all expressions that represent a single value:  
projection, constants, and eventually, tuple and map dereferences.  I think 
it's useful for understanding the categorization of expressions.  I'm not sure 
it adds any functionality.


 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions.patch, lp.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12800125#action_12800125
 ] 

Hadoop QA commented on PIG-1178:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12430202/expressions.patch
  against trunk revision 898497.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/174/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/174/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/174/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions.patch, lp.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-12 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799371#action_12799371
 ] 

Alan Gates commented on PIG-1178:
-

This code didn't make any changes in the distinct bag area.  Also, these same 
failures have been seen on the load-store branch.  And running the test locally 
does not reproduce the failure.  So I suspect there is some intermittent bug 
with the bags rather than an issue with this code.  I'm going to go ahead and 
check it in.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: lp.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799552#action_12799552
 ] 

Hadoop QA commented on PIG-1178:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12430056/expressions.patch
  against trunk revision 898497.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/171/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/171/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/171/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: expressions.patch, lp.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-11 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798855#action_12798855
 ] 

Daniel Dai commented on PIG-1178:
-

Couple of questions:
1. Why do we need a pos arguments in PlanEdge? What's the use case for that?
2. Where will relational operator methods go? Such as getRequiredFields, 
getProjectionMap, getRelevantInputs, pruneColumns. Are we going to solve them 
using uid?
3. What is the functional division between Rule.match() and 
PatternMatchOperatorPlan.check()? Can we wrap both logic in one class (Rule) 
rather than two? Leave PatternMatchOperatorPlan simple seems to more clear to 
the rule writers.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
 Attachments: lp.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-11 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798872#action_12798872
 ] 

Alan Gates commented on PIG-1178:
-

bq. 1. Why do we need a pos arguments in PlanEdge? What's the use case for that?
One of the issues we saw frequently with the current implementation of 
OperatorPlan is that for nodes with multiple inputs (or outputs), if a 
transformation required disconnecting one of those inputs and connecting a new 
one, it often changed the order of the inputs (that is, what had been 
plan.getPredecessors(op).get(1) became plan.getPredecessors(op).get(1)).  The 
ability to connect a PlanEdge as a particular input or output is meant to 
address this.

bq.  2. Where will relational operator methods go? Such as getRequiredFields, 
getProjectionMap, getRelevantInputs, pruneColumns. Are we going to solve them 
using uid?
They should go away.  Patching up a plan after the transform will be the 
responsibility of the PlanTransformListeners.  The hypothesis is that schema 
plus uid will be sufficient for these to do their jobs.

pruneColumns is a special case, but again I think that schema plus uid will be 
sufficient.

bq. 3. What is the functional division between Rule.match() and 
PatternMatchOperatorPlan.check()? Can we wrap both logic in one class (Rule) 
rather than two? Leave PatternMatchOperatorPlan simple seems to more clear to 
the rule writers.
I'll leave this one for Ying.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
 Attachments: lp.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-11 Thread Ying He (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798880#action_12798880
 ] 

Ying He commented on PIG-1178:
--

the Rule.match() finds a potential match and delegate to 
PatternMatchOperatorPlan to further verify if it matches the pattern. During 
it's verification, the plan is filled with the operators from original plan.  
The PetternMatchOperatorPlan is not visible to rule writers. Rule writers 
should only use OperatorPlan to operate on the matched sub-plans.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: lp.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-11 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798943#action_12798943
 ] 

Daniel Dai commented on PIG-1178:
-

Thanks for clarification. +1 for the change.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: lp.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-11 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798969#action_12798969
 ] 

Daniel Dai commented on PIG-1178:
-

Two minor observations:
1. PatternMatchOperatorPlan.parent does not  sound very meanful, shall we 
change the name to basePlan?
2. In Rule.match, is PatternMatchOperatorPlan only contains leave nodes but not 
edge information? If so, instead of saying A list of all matched sub-plans, 
can we put more details in the comments?

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: lp.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799071#action_12799071
 ] 

Hadoop QA commented on PIG-1178:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12429960/PIG_1178.patch
  against trunk revision 896951.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/170/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/170/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/170/console

This message is automatically generated.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Ying He
 Attachments: lp.patch, PIG_1178.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-08 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798112#action_12798112
 ] 

Alan Gates commented on PIG-1178:
-

Comments on lp.patch:

Is the OperatorPlan that is returned by Rule.Match distinct from the plan that 
the Transformer is operating on?  Will the nodes inside that plan be the same 
nodes (ie that same java objects) that are in the original plan passed to 
Rule.Match?  If the answer is yes and yes, then things will still work, but 
rule writers will have to be very careful when working with the plans.  The 
reason is that the plan returned by Rule.Match will not be the same plan as 
returned by Operator.getPlan() when called on the matched nodes.  And rules 
should use the original plan (the one returned by Operator.getPlan()) to 
navigate, as that is the real plan.  I don't see a better way to do this, so 
I'm ok with it.  But we will have to document it very well in the javadocs for 
writing rules.

In TestExperimentalRule I think we should add tests for two more patterns:
1. a join with two loads as predecessors
2. a spilt with two filters as successors

Also, since I wrote some of this code, we need to have another committers 
review it before I check it in.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
 Attachments: lp.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-01-08 Thread Ying He (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798273#action_12798273
 ] 

Ying He commented on PIG-1178:
--

yes, the operator plan that Rule.match returned has the same java objects as 
the original plan.  So as Alan said, it's a bit confusing. the plan returned by 
Rule.match is more similar to a view. 

the test cases for the two patterns will be added later.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
 Attachments: lp.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.