Thanks Heikki,

Von: Heikki Linnakangas <[EMAIL PROTECTED]>
> > I am implementing a technique that sorts a result set according to
> weight annotations in the WHERE.
> > 
> > The query
> > 
> > SELECT * FROM cars 
> > WHERE (cdChanger=1){2} 
> >    OR (mp3player=1){1} 
> > 
> > would be sorted according to partial conditions that hold.
> You could do that like this, with no need to hack the backend:
> SELECT * FROM cars
> WHERE (cdChanger=1)
>    OR (mp3player=1)
> ORDER BY (CASE WHEN cdchanger=1 THEN 2 ELSE 0 END) +
>          (CASE WHEN mp3player=1 THEN 1 ELSE 0 END) DESC;

that is certainly a nice, straigthforward solution, however, I forgot to 
mention my most important goal: I want to optimize those weighted queries 
within pg.

As I believe these weights add - in certain cases - additional info that can be 
exploited to gain nice performance improvements in the face of top-k queries.

Just 2 examples for top-k query optimization:

1) Consider a query like

WHERE (cdChanger=1){1} 
   OR (mp3player=1){42} 
   OR (resemblesBatmobile=1){4711}

where the weights differ considerably. Here it might suffice to consider tuples 
with (resemblesBatmobile=1){4711} at first and filter with that condition. If 
this pre filtering yields (at least) 10 tuples, we're left off with the task to 
sort just the remaining set according to the remaining conditions and their 
weights (cdChanger=1){1} and (mp3player=1){42}. 

In a case like 
WHERE (cdChanger=1){1} 
   OR (mp3player=1){42} 
   OR (resemblesBatmobile=1){4711}
   OR (resemblesKITT=1){4711}

where two weights are equal the selectivities of resemblesBatmobile=1 and 
resemblesKITT=1 could aid the optimizers decision which condition nicest for a 
pre filtering.
(Consider how great the benefit is in case of expensive predicates like 
subqueries, contrary to these toy comparison ops of mine.)

And so on - your idea here.

2) The second example is a form of join optimization. Consider a query with a 
simple join

WHERE  R.foobar = S.foobar
   AND (( = 1){10}
     OR ( = 6){20})
LIMIT  10;

where again just 10 tuples are needed. Normally, the planner would generate a 
join that evaluates the = 1 or = 6 - there's no other choice.

However by rewriting the condition in an equivalent form we can exploit the 
weights by creating 3 Joins instead of a single one:

WHERE  R.foobar = S.foobar
   AND ((  = 1){10}
     AND ( = 6){20})     --> cream of the crop: weight 30
   AND (( <> 1){10}
     AND ( = 6){20})     --> second class citizens: weight 20
   AND (( = 1){10}
     AND ( <> 6){20})    --> still ok: weight 10

By rewriting we gain 2 nice properties:

1. The conditions can be pushed from the (now 3) joins to the selection at the 

2. The tuples are disjunctive regarding their weights: The first join yields 
the best tuples with weight 30. If it gains the needed 10 tuples, we're set - 
voila. Otherwise, well, let's consider the next join with the w20-tuples. 

I believe there are many more ways to speed up execution with the help of 

That's why I need the weights around in the executor, desperately :)


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at


Reply via email to