[HACKERS] EXEC_EVALDEBUG debugging broken?

2007-07-24 Thread peter . trautmeier
Hi all,

I am using version 8.2.4 of the source and compiled it with
both OPTIMIZER_DEBUG and EXEC_EVALDEBUG enabled to take a look
at how quals are evaluated by the executor.

However, when I issue a query like

SELECT name FROM city WHERE population  10 LIMIT 10;

I get the following debug output from postgres:

After canonicalize_qual()
   {OPEXPR 
   :opno 97 
   :opfuncid 66 
   :opresulttype 16 
   :opretset false 
   :args (
  {VAR 
  :varno 1 
  :varattno 4 
  :vartype 23 
  :vartypmod -1 
  :varlevelsup 0 
  :varnoold 1 
  :varoattno 4
  }
  {CONST 
  :consttype 23 
  :constlen 4 
  :constbyval true 
  :constisnull false 
  :constvalue 4 [ -96 -122 1 0 ]
  }
   )
   }

RELOPTINFO (1): rows=1018 width=88
baserestrictinfo: city.population  10
path list:
SeqScan(1) rows=1018 cost=0.00..66.16

cheapest startup path:
SeqScan(1) rows=1018 cost=0.00..66.16

cheapest total path:
SeqScan(1) rows=1018 cost=0.00..66.16

WARNING:  could not dump unrecognized node type: 404
ExecQual: qual is (
   {
   }
)


WARNING:  could not dump unrecognized node type: 404
ExecQual: qual is (
   {
   }
)

... and many more of this WARNINGs.

What happens to the OpExpr on its way from canonicalize_qual() to ExecQual() 
that makes _outNode() stumble over it when it is encountered
 in ExecQual()?

Regards,
Peter

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] EXEC_EVALDEBUG debugging broken?

2007-07-24 Thread peter . trautmeier
Von: Tom Lane [EMAIL PROTECTED]
 [EMAIL PROTECTED] writes:
  WARNING:  could not dump unrecognized node type: 404
  ExecQual: qual is (
 {
 }
  )
 
 Yeah, that code is toast, we probably ought to remove it.  It hasn't
 worked since the changes to make the executor treat plan trees as
 read-only.  

Thanks Tom!

Interesting, what do you mean by Plan trees are 'read only' now? Is it the 
distinction between Plan trees and their corresponding PlanState nodes that 
indicate the 'read only' behaviour and the 'writeable' state of the Plan, 
respectively, that was introduced at that time?

 Making it work would require teaching outfuncs.c how to dump
 all the different expression state node types, which seems like more
 maintenance effort than is justified for debug support that no one uses.

Ok, but what type has this qual from my example that was once a OpExpr as soon 
as it arrives at ExecQual? It's obviously not a OpExpr - otherwise _outNode 
wouldn't stumble over it.
(Is there a way do get this type info with gdb's help?)

 I'd suggest using EXPLAIN VERBOSE instead, which will give you
 the same printout that this would have given you back when it did
 work, but only once instead of over again for each row.

Thanks, I hadn't seen the VERBOSE option before.

Regards,
Peter

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Design: Escort info from WHERE clause to executor?

2007-07-24 Thread peter . trautmeier
Hi all,

I want to pass additional weight info from the WHERE clause to the executor and 
I hope someone can help me with this.

I accept clauses like the following

WHERE (foo='a'){1}
WHERE (foo='a'){1} OR (bar='b'){2}
WHERE ((foo='a'){1} OR (bar='b'){2})){42} OR (baz='c'){3}

where the {} takes an integer as a weight that is attached to the preceding 
(partial) condition.

In the executor, I need to access (1) the logical value of and (2) the weight 
associated with _each_ subexpression that was entered. (Getting the weight from 
the parser to the executor is in itself a journey it seems, as some expression 
types are created anew - and not copied - and lose their annotated weight over 
and over again.)

Furthermore I need the structure of OR to be preserved; the OR-of-OR structure 
from the last WHERE must be preserved or at least be  reconstructible and must 
not be folded into a 3-valued OR (as canonicalize_qual and friends do.)

To sum up, I am looking for a (decently efficient) scheme that is able to

(1) pass arbitrary conditional expressions from WHERE to the executor in a 
structure preserving way. 
(2) annotate arbitrary expressions with weights that survive on its way from 
the parser to the executor.
(3) access the logical value of particular subexpressions.

I have some basic ideas how at least some of the requirements might be 
achieved. But as I am not totally satisfied with my ideas I hope you can 
provide me with some fresh input.

ANY ideas are welcome.

Regards,
Peter

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] EXEC_EVALDEBUG debugging broken?

2007-07-25 Thread peter . trautmeier

Von: Tom Lane [EMAIL PROTECTED]
 Yeah, exactly.  ExecInitExpr builds an ExprState tree that mirrors the
 structure of the Expr tree but contains all the run-time-variable data.
 This tree is what's now being passed to ExecQual.

I see, and ExecInitExpr wraps the OpExpr in an FuncExprState.

Is it possible to store the calculated logical value of certain expressions, 
e.g. boolean OpExprs, in their ExprState on a per tuple basis to reuse them 
later?
(I guess I described some kind of 'condition cache' here.)

 The problem is that outfuncs.c knows about all the Expr node types and
 none of the ExprState types, there being no need to dump the latter in
 normal use.  There is a valid argument that we ought to support dumping
 PlanState and ExprState trees for debugging purposes, but it just seems
 like more maintenance effort than it's worth ...

I don't know how often these ExprState nodes really change and how much 
maintenance they require, but I get your point.

  (Is there a way do get this type info with gdb's help?)
 
 p *(Node *) ptr ought to do it.

Thanks, that works.

Regards,
Peter

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Design: Escort info from WHERE clause to executor?

2007-07-25 Thread peter . trautmeier
Thanks imad,

Von: imad [EMAIL PROTECTED]
 It looks like you need a customized version of AExpr Node.
 In the backend parser, an AExpr Node is constructed against each given
 WHERE expression. You can store the weight along with the expression.
 Further, don't forget to upgrade the copy functions and equal
 functions for AExpr if you want to take this weight value all the way
 upto the executor.

I have already done that, alas, it doesn't suffice: During parse analysis 
transformExpr() turns the AExpr into other types, e.g. an OpExpr in case of foo 
= 42.

These newly created OpExprs won't last until they reach the executor in every 
case, though: They are sometimes recreated anew during planning 'manually', 
i.e. not by using the standard copy functions but by creating a new OpExpr node 
and copying the fields on an as-seen-before basis - these are the places where 
my weight gets lost.

To be honest, I consider not using a special copy function a minor design flaw 
:) *sigh*

Anyway, thanks again. But giving a weight to AExpr is just the tip of the 
iceberg when it comes to my final goal - let my weights enter the executor ;)

Regards,
Peter

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Design: Escort info from WHERE clause to executor?

2007-07-25 Thread peter . trautmeier

Von: Heikki Linnakangas [EMAIL PROTECTED]
 [EMAIL PROTECTED] wrote:
  To sum up, I am looking for a (decently efficient) scheme that is able
 to
  
  (1) pass arbitrary conditional expressions from WHERE to the executor in
 a structure preserving way. 
  (2) annotate arbitrary expressions with weights that survive on its way
 from the parser to the executor.
  (3) access the logical value of particular subexpressions.
  
  I have some basic ideas how at least some of the requirements might be
 achieved. But as I am not totally satisfied with my ideas I hope you can
 provide me with some fresh input.
 
 Why? What are you trying to achieve?

I am implementing a technique that sorts a result set according to weight 
annotations in the WHERE.

The query

SELECT * FROM cars 
WHERE (cdChanger=1){2} 
   OR (mp3player=1){1} 

would be sorted according to partial conditions that hold.

Cars that have both a CD changer AND a MP3 player get a weight of 3, i.e. (2+1).
Cars that only have a CD changer get a weight of 2.
Cars that only have a MP3 player get a weight of 1.
Cars that have neither a CD changer nor a MP3 player do not belong to the 
result set anyway.

I have to sort the tuples according to their individual weight - that is why I 
need to annotate arbitrary expressions with weights.

This is a simple example, but in case of cascaded ORs and ANDs the semantics 
gets slightly trickier - that is why I need the structure of the original WHERE 
clause preserved.

The executor as it stands now is evaluating quals in a short circuit manner, 
and that is totally feasible: As soon as a single subexpression of and ANDed 
qual is false, it stops. 

However, in order to sort the tuples in the result set I generally need to know 
the logical values of all subexpressions, or at least more of them as the 
executor needs for its rather coarsely grained decision 'belongs-to-result-set' 
or 'does-not-belong'.

In general, I have to evaluate the original WHERE expr tree level by level, 
starting at the top(root) that represents the whole condition and possibly 
visiting and evaluating every single subexpression in the tree to every leaf, 
until I have enough information to compare two tuples with each other.

I hope I got the basic idea across, but please don't hesitate to ask for more 
details if I failed to paint the picture clearly enough. I appreciate your 
interest.

(I know that seems complicated, and indeed I am getting the feeling that this 
_is_ complicated. But hey, then again it's my duty ;) )

Regards,
Peter

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Design: Escort info from WHERE clause to executor?

2007-07-25 Thread peter . trautmeier
Thanks Heikki,

Von: Heikki Linnakangas [EMAIL PROTECTED]
  I am implementing a technique that sorts a result set according to
 weight annotations in the WHERE.
  
  The query
  
  SELECT * FROM cars 
  WHERE (cdChanger=1){2} 
 OR (mp3player=1){1} 
  
  would be sorted according to partial conditions that hold.
 
 You could do that like this, with no need to hack the backend:
 
 SELECT * FROM cars
 WHERE (cdChanger=1)
OR (mp3player=1)
 ORDER BY (CASE WHEN cdchanger=1 THEN 2 ELSE 0 END) +
  (CASE WHEN mp3player=1 THEN 1 ELSE 0 END) DESC;

that is certainly a nice, straigthforward solution, however, I forgot to 
mention my most important goal: I want to optimize those weighted queries 
within pg.

As I believe these weights add - in certain cases - additional info that can be 
exploited to gain nice performance improvements in the face of top-k queries.

Just 2 examples for top-k query optimization:

1) Consider a query like

SELECT * FROM cars 
WHERE (cdChanger=1){1} 
   OR (mp3player=1){42} 
   OR (resemblesBatmobile=1){4711}
LIMIT 10;

where the weights differ considerably. Here it might suffice to consider tuples 
with (resemblesBatmobile=1){4711} at first and filter with that condition. If 
this pre filtering yields (at least) 10 tuples, we're left off with the task to 
sort just the remaining set according to the remaining conditions and their 
weights (cdChanger=1){1} and (mp3player=1){42}. 

In a case like 
...
WHERE (cdChanger=1){1} 
   OR (mp3player=1){42} 
   OR (resemblesBatmobile=1){4711}
   OR (resemblesKITT=1){4711}

where two weights are equal the selectivities of resemblesBatmobile=1 and 
resemblesKITT=1 could aid the optimizers decision which condition nicest for a 
pre filtering.
(Consider how great the benefit is in case of expensive predicates like 
subqueries, contrary to these toy comparison ops of mine.)

And so on - your idea here.

2) The second example is a form of join optimization. Consider a query with a 
simple join

SELECT *
FROM   R,S
WHERE  R.foobar = S.foobar
   AND ((R.foo = 1){10}
 OR (S.bar = 6){20})
LIMIT  10;

where again just 10 tuples are needed. Normally, the planner would generate a 
join that evaluates the R.foo = 1 or S.bar = 6 - there's no other choice.

However by rewriting the condition in an equivalent form we can exploit the 
weights by creating 3 Joins instead of a single one:

WHERE  R.foobar = S.foobar
   AND ((R.foo  = 1){10}
 AND (S.bar = 6){20}) -- cream of the crop: weight 30
 ORii
   AND ((R.foo  1){10}
 AND (S.bar = 6){20}) -- second class citizens: weight 20
 OR
   AND ((R.foo = 1){10}
 AND (S.bar  6){20})-- still ok: weight 10

By rewriting we gain 2 nice properties:

1. The conditions can be pushed from the (now 3) joins to the selection at the 
leaf-relation.

2. The tuples are disjunctive regarding their weights: The first join yields 
the best tuples with weight 30. If it gains the needed 10 tuples, we're set - 
voila. Otherwise, well, let's consider the next join with the w20-tuples. 
Iterate.

I believe there are many more ways to speed up execution with the help of 
weights.

That's why I need the weights around in the executor, desperately :)

Regards,
Peter


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[HACKERS] How to add a column in a executor node?

2007-08-24 Thread peter . trautmeier
Hi all,

I want to add a column, i.e. an additional TargetEntry, in an executor node 
named Foo that is placed on top of the usually created plan. This node Foo will 
calculate the column which is then used to sort the relation subsequently. If 
desired, the column added by Foo is finally removed by a projection node, and 
might thus be a junk col.

(Projection 
that removes bar, 
if desired)
 |
 |
Sort,
sorts by bar
 |
 |
Foo,
adds column bar
 |
 |
Usual
Plan
 |
 |
...


How do I add an additional TargetEntry in my Foo node? Unfortunately, it does 
not suffice to simply append the TargetEntry after copying the child's 
targetlist in make_foo(Plan *). (I tested it and seems that some client side 
code complains about 'advertised_#cols != actual_#cols')

How do I compute/evaluate the column? I just see ExecProject which  evaluates 
the _whole_ targetlist with ExecTargetList. But I would like to retrieve a 
tuple from the child node, evaluate my expression, wrap it into a TargetEntry 
and append it to the tuple. Iterate.

Regards,
Peter

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] ecxt_scantuple has wrong TupleDesc

2007-12-17 Thread peter . trautmeier
Hi all,

I wonder why my ecxt_scantuple has a TupleDesc matching the subplan's tlist, 
not my plan's tlist. Basically, my question is what is x in

econtext-ecxt_scantuple = x ?

I have written a new executor node Foo, with corresponding ExecFoo and make_foo 
functions. I have also written a new Expr type called Bar, along with a 
ExecEvalBar. 

In make_foo I append some Bar columns. When I try to evaluate the Bar columns 
in ExecFoo via ExecProject, I need the Bar columns to access each other. 

But sadly, the ecxt_scantuple doesn't have any Bar columns because it has a 
TupleDesc from the subplan.

What are the steps I have to take to make sure I can access my Bar columns 
(which only exist in Foo) in ExecEvalBar? Where do I get a TupleTableSlot with 
the desired TupleDesc from?

I'll be happy to provide more information, if needed.

Regards
Peter

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] ecxt_scantuple has wrong TupleDesc

2007-12-18 Thread peter . trautmeier
Thanks Tom, 

that made it clear I made a mistake.

 That's the way it's supposed to be --- the scantuple slot is for
 scanning your subplan's output.

Browsing through the code I get the impression, that

- ecxt_scantuple is only used by Scan nodes (i.e. SeqScan, IndexScan, 
SubqueryScan and FunctionScan)

- ecxt_innertuple and ecxt_outertuple are used by Join nodes (but they don't 
use the ecxt_scantuple at all).

Is it right that the ecxt_scantuple is only used by leafnodes (Scans) and never 
by Joins?

  I have written a new executor node Foo, with corresponding ExecFoo and
 make_foo functions. I have also written a new Expr type called Bar, along
 with a ExecEvalBar. 
 
  In make_foo I append some Bar columns. When I try to evaluate the Bar
 columns in ExecFoo via ExecProject, I need the Bar columns to access each
 other. 
 
 That makes no sense at all.  ExecProject can't be expected to access
 output columns of the current node --- they haven't been computed yet.

Conceptionally my only way is then to crack the dependency between the Bar 
columns by splitting the computation in different nodes - and then put these 
nodes atop of each other.

But then, I still don't get the relationship between

 INNER, OUTER varnos on the one side and 
 ecxt_scantuple, ecxt_outertuple and ecxt_innertuple on the other side.

May a non-leaf node refer to a Var with a 'normal' scan varno or only to INNER 
and OUTER varnos?

How is my Foo node supposed to access the suplan's (whatever that might me, a 
Scan or a Join) columns? Should Foo'- tlist have Vars with OUTER varnos and the 
right varattos?

I cannot find an example of a node that (1) does projection and (2) is not a 
Scan (i.e a leaf node) and (3) uses ecxt_scantuple to pipe information to an 
upper node.

I'd be happy if you clarified my misconceptions, I greatly appreciate your help.

Regards
Peter

---(end of broadcast)---
TIP 6: explain analyze is your friend