RE: How to clone a logical plan ?

2009-11-05 Thread Santhosh Srinivasan
You have hit a bug. I think LOJoin has to be added to
LogicalPlanCloneHelper.java. Can you file a jira?

Thanks,
Santhosh

-----Original Message-----
From: Ashutosh Chauhan [mailto:ashutosh.chau...@gmail.com] 
Sent: Thursday, November 05, 2009 3:28 PM
To: pig-dev@hadoop.apache.org
Subject: How to clone a logical plan ?

Hi,

For our cost-based optimizer, we need to generate alternative plans for a
given query plan and evaluate them by their estimated cost. To do that, I
want to clone a logical plan. I thought LogicalPlanCloner was meant for
that, but it doesn't seem to work. I added this simple test case to
TestLogicalPlanBuilder.java:

    public void testLogicalPlanCloneHelper() throws CloneNotSupportedException {
        // The Pig Latin query is passed to buildPlan as a string.
        LogicalPlan lp = buildPlan("C = join (load 'A') by $0, (load 'B') by $0;");
        LogicalPlanCloner cloner = new LogicalPlanCloner(lp);
        cloner.getClonedPlan();
    }

and this fails with the following stacktrace:

java.lang.NullPointerException
    at org.apache.pig.impl.logicalLayer.LOVisitor.visit(LOVisitor.java:171)
    at org.apache.pig.impl.logicalLayer.PlanSetter.visit(PlanSetter.java:63)
    at org.apache.pig.impl.logicalLayer.LOJoin.visit(LOJoin.java:213)
    at org.apache.pig.impl.logicalLayer.LOJoin.visit(LOJoin.java:45)
    at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:67)
    at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
    at org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:50)
    at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
    at org.apache.pig.impl.logicalLayer.LogicalPlanCloneHelper.getClonedPlan(LogicalPlanCloneHelper.java:73)
    at org.apache.pig.impl.logicalLayer.LogicalPlanCloner.getClonedPlan(LogicalPlanCloner.java:46)
    at org.apache.pig.test.TestLogicalPlanBuilder.testLogicalPlanCloneHelper(TestLogicalPlanBuilder.java:2110)

I am debugging this, but wanted to ask if I have hit a bug here or if I
am doing something wrong?

Thanks,
Ashutosh


RE: How to clone a logical plan ?

2009-11-05 Thread Santhosh Srinivasan
If my memory serves me correctly, logical plan cloning was implemented
(by me) for cloning the inner plans of foreach. As a result, top-level
plan cloning was never tested, and some items are marked as TODO (see
the visit methods for LOLoad, LOStore and LOStream).

If you want to use it the way your test case does, you need to add code
for cloning LOLoad, LOStore, LOStream and LOJoin.
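
To illustrate why each operator type needs its own cloning code, here is a minimal sketch of a visitor-style cloner. Everything in it is hypothetical: Op, LoadOp, JoinOp and CloneVisitor are stand-ins, not Pig's LogicalOperator, LOLoad, LOJoin or LogicalPlanCloneHelper, and real operators carry much more state (schemas, inner plans, file specs). The point is only that an operator type becomes clonable once a visit method exists that copies its fields and rewires its inputs to the cloned operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for logical operators; these are NOT Pig's classes.
abstract class Op {
    final String alias;
    final List<Op> inputs = new ArrayList<>();
    Op(String alias) { this.alias = alias; }
    abstract Op accept(CloneVisitor v);
}

class LoadOp extends Op {
    final String path;
    LoadOp(String alias, String path) { super(alias); this.path = path; }
    Op accept(CloneVisitor v) { return v.visit(this); }
}

class JoinOp extends Op {
    final List<Integer> keyColumns; // one join key column per input
    JoinOp(String alias, List<Integer> keyColumns) {
        super(alias);
        this.keyColumns = keyColumns;
    }
    Op accept(CloneVisitor v) { return v.visit(this); }
}

class CloneVisitor {
    // Maps each original operator to its copy so a shared input is cloned once.
    private final Map<Op, Op> copies = new IdentityHashMap<>();

    Op cloneOf(Op original) {
        Op copy = copies.get(original);
        if (copy == null) {
            copy = original.accept(this);
            copies.put(original, copy);
        }
        return copy;
    }

    // One visit method per operator type: this is the code that must be added.
    Op visit(LoadOp load) {
        return new LoadOp(load.alias, load.path);
    }

    Op visit(JoinOp join) {
        JoinOp copy = new JoinOp(join.alias, new ArrayList<>(join.keyColumns));
        for (Op input : join.inputs) {
            copy.inputs.add(cloneOf(input)); // point at cloned inputs, not originals
        }
        return copy;
    }
}

class CloneSketchDemo {
    public static void main(String[] args) {
        JoinOp c = new JoinOp("C", Arrays.asList(0, 0));
        c.inputs.add(new LoadOp("A", "A"));
        c.inputs.add(new LoadOp("B", "B"));
        Op copy = new CloneVisitor().cloneOf(c);
        System.out.println(copy != c && copy.inputs.size() == 2);
    }
}
```

The identity map matters once a plan is a DAG rather than a tree: without it, an operator feeding two consumers would be duplicated in the clone.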

Santhosh


Re: How to clone a logical plan ?

2009-11-05 Thread Ashutosh Chauhan
Thanks, Santhosh, for the quick response and explanation. It saved me a
few hours of debugging :)

Ashutosh


Re: How to clone a logical plan ?

2009-11-05 Thread Dmitriy Ryaboy
Richard,
The Load/Store redesign proposal has an interface that defines how
stats get represented; a loader that implements ResourceLoader will
pass statistics up into Pig, which will then take care of doing
whatever it needs to do with them. The specifics of how the stats get
loaded in by the loader are up to the implementation of the loader --
they can be read in from a metadata service, sampled on the fly,
stored in a metadata file, etc.
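
As a rough sketch of that separation (the optimizer asks for statistics, the loader decides where they come from), consider the following. All names here are hypothetical: StatsProvider and TableStats are illustrative stand-ins and are not the interfaces or the ResourceStatistics class from the Load/Store redesign proposal.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical value type; NOT the ResourceStatistics class from the proposal.
final class TableStats {
    final long rowCount;
    final long sizeInBytes;
    TableStats(long rowCount, long sizeInBytes) {
        this.rowCount = rowCount;
        this.sizeInBytes = sizeInBytes;
    }
}

// Hypothetical contract: the caller sees only this method, never the source.
interface StatsProvider {
    Optional<TableStats> statistics(String location);
}

// Source 1: statistics recorded ahead of time, e.g. read from a metadata file.
class PrecomputedStats implements StatsProvider {
    private final Map<String, TableStats> known = new HashMap<>();

    void record(String location, TableStats stats) {
        known.put(location, stats);
    }

    public Optional<TableStats> statistics(String location) {
        return Optional.ofNullable(known.get(location));
    }
}

// Source 2: statistics estimated on the fly from a sample of rows.
class SampledStats implements StatsProvider {
    private final List<Integer> sampleRowBytes; // byte length of each sampled row
    private final long totalBytes;              // size of the whole input

    SampledStats(List<Integer> sampleRowBytes, long totalBytes) {
        this.sampleRowBytes = sampleRowBytes;
        this.totalBytes = totalBytes;
    }

    public Optional<TableStats> statistics(String location) {
        long sampledBytes = 0;
        for (int bytes : sampleRowBytes) {
            sampledBytes += bytes;
        }
        if (sampledBytes == 0) {
            return Optional.empty(); // nothing to extrapolate from
        }
        // Extrapolate: total rows = total bytes / average sampled row size.
        long estimatedRows = totalBytes * sampleRowBytes.size() / sampledBytes;
        return Optional.of(new TableStats(estimatedRows, totalBytes));
    }
}
```

Returning an empty Optional lets a loader say "no statistics available" (for example, a plain-text log file with no metadata) without forcing the caller to guess.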

For simplicity, we are working with serialized JSON representations of
ResourceStatistics right now.

-Dmitriy

2009/11/6 RichardGUO Fei gladiato...@hotmail.com:

 Hi Dmitriy,

 Thanks for sharing. I look forward to seeing your work. I implemented my
 own storage and want to connect Pig to it. So that the optimizer can
 fully benefit from the histogram and the side information of my storage,
 I am thinking of implementing a cost-based optimizer.

 How do you plan to pass in the statistics? Say your input file is a
 plain-text log file: do you require users to collect the statistics
 themselves? Or do you plan to limit this to certain types of storage?

 Thanks,
 Richard

 Date: Thu, 5 Nov 2009 22:54:47 -0500
 Subject: Re: How to clone a logical plan ?
 From: dvrya...@gmail.com
 To: pig-dev@hadoop.apache.org

 At a high level, we are implementing the framework for propagating
 statistics between Pig operators, and using said statistics to make
 moderately intelligent decisions about Join types that should be used
 (unless they are specified by the user).  We do this in a fairly
 brute-force manner, by generating all alternative plans (that part is
 not working so hot right now, see subject) and costing them, choosing
 the global minimum (there is some pruning happening, but not as much
 as something like System R).  As far as relation order inside a given
 Join, we set that deterministically after choosing the join, as Pig
 has specific preferences for where the largest relation should go for
 a given join type.  Once we have join type selection working, other
 optimizations can be added -- the tricky part is making sure the
 costing functions can't produce drastically wrong results.

 All the work is happening at the logical layer, between the rule-based
 optimizer and LogToPhysTranslator.
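
The brute-force approach described above (enumerate every alternative, cost each, keep the global minimum) can be sketched as follows. This is an illustration under loud assumptions: the JoinType values, the JoinSpec class, the memory limit and the cost formulas are all made up for the example and do not reflect Pig's join implementations or the optimizer's actual costing functions. With per-join costs that are independent, as here, the minimum could also be found join by join; full enumeration is shown because real plan costs interact across operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class JoinCostingSketch {
    // Hypothetical join strategies, for illustration only.
    enum JoinType { HASH, REPLICATED, MERGE }

    // A join described only by estimated input sizes (hypothetical).
    static final class JoinSpec {
        final long leftRows;
        final long rightRows;
        final boolean inputsSorted;
        JoinSpec(long leftRows, long rightRows, boolean inputsSorted) {
            this.leftRows = leftRows;
            this.rightRows = rightRows;
            this.inputsSorted = inputsSorted;
        }
    }

    // Assumed cap on the relation that a replicated join holds in memory.
    static final long MEMORY_LIMIT_ROWS = 1_000;

    // Made-up cost function; POSITIVE_INFINITY marks an infeasible choice.
    static double cost(JoinSpec join, JoinType type) {
        long small = Math.min(join.leftRows, join.rightRows);
        long large = Math.max(join.leftRows, join.rightRows);
        switch (type) {
            case REPLICATED:
                return small <= MEMORY_LIMIT_ROWS ? large + small : Double.POSITIVE_INFINITY;
            case MERGE:
                return join.inputsSorted ? 1.5 * (small + large) : Double.POSITIVE_INFINITY;
            default:
                return 3.0 * (small + large); // HASH is always feasible
        }
    }

    // Tries every assignment of a join type to each join; returns the cheapest.
    static List<JoinType> cheapestAssignment(List<JoinSpec> joins) {
        JoinType[] types = JoinType.values();
        int n = joins.size();
        int[] choice = new int[n]; // odometer over all types.length^n assignments
        List<JoinType> best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        while (true) {
            double total = 0;
            for (int i = 0; i < n; i++) {
                total += cost(joins.get(i), types[choice[i]]);
            }
            if (total < bestCost) {
                bestCost = total;
                best = new ArrayList<>();
                for (int i = 0; i < n; i++) {
                    best.add(types[choice[i]]);
                }
            }
            // Advance the odometer; stop once every position has wrapped.
            int pos = 0;
            while (pos < n && ++choice[pos] == types.length) {
                choice[pos] = 0;
                pos++;
            }
            if (pos == n) {
                break;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<JoinSpec> joins = Arrays.asList(
                new JoinSpec(1_000_000, 500, false),
                new JoinSpec(2_000_000, 3_000_000, true));
        System.out.println(cheapestAssignment(joins));
    }
}
```

The number of assignments grows exponentially with the number of joins, which is why pruning matters for anything beyond small plans.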

 -D


 2009/11/5 RichardGUO Fei gladiato...@hotmail.com:
 
  Hi,
 
  I am also doing a cost-based optimizer. So I am interested in knowing some 
  of the specs that you are after.
 
  Thanks,
  Richard
 