Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-08 Thread Florian G. Pflug

Tom Lane wrote:

"Florian G. Pflug" <[EMAIL PROTECTED]> writes:

Tom Lane wrote:

But you don't have any cost numbers until after you've done the plan.



Couldn't this work similar to geqo_effort? The planner could
try planning the query using only cheap algorithmns, and if
the cost exceeds a certain value, it'd restart, and use
more sophisticated methods.


AFAICS this would be a net loss on average.  Most of the time, the
constraint exclusion code doesn't win, and so throwing away all your
planning work to try it is going to be a loser most of the time.


On the other hand, if the "consider-replanning" threshold is high enough,
than that additional time really doesn't matter - If a query runs for minutes,
or even hours, a few wasted cycles during planning don't hurt.

greetings, Florian Pflug

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-07 Thread Rod Taylor
On Mon, 2006-08-07 at 22:01 -0400, Tom Lane wrote:
> "Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> But you don't have any cost numbers until after you've done the plan.
> 
> > Couldn't this work similar to geqo_effort? The planner could
> > try planning the query using only cheap algorithmns, and if
> > the cost exceeds a certain value, it'd restart, and use
> > more sophisticated methods.
> 
> AFAICS this would be a net loss on average.  Most of the time, the
> constraint exclusion code doesn't win, and so throwing away all your
> planning work to try it is going to be a loser most of the time.

If constraint exclusion does not make any changes, mark the plan as
invalid, then there is no need to replan.

 1. Generate plan cheaply
 2. If under $threshold, execute query. The cost of further planning
is significant compared to executing this potentially
non-optimal plan.
 3. Run constraint exclusion. If it changes the clauses due to
constraint exclusion, mark the plan as invalid. I assume
constraint exclusion is relatively self contained.
 4. Invalid plan is replanned. Still valid plan (no potential
improvements can be made) is executed.

-- 


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-07 Thread Tom Lane
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> But you don't have any cost numbers until after you've done the plan.

> Couldn't this work similar to geqo_effort? The planner could
> try planning the query using only cheap algorithmns, and if
> the cost exceeds a certain value, it'd restart, and use
> more sophisticated methods.

AFAICS this would be a net loss on average.  Most of the time, the
constraint exclusion code doesn't win, and so throwing away all your
planning work to try it is going to be a loser most of the time.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-07 Thread Florian G. Pflug

Tom Lane wrote:

Rod Taylor <[EMAIL PROTECTED]> writes:

A simple way of doing this might be to use a minimum cost number?


But you don't have any cost numbers until after you've done the plan.


Couldn't this work similar to geqo_effort? The planner could
try planning the query using only cheap algorithmns, and if
the cost exceeds a certain value, it'd restart, and use
more sophisticated methods.

greetings, Florian Pflug

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-07 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> To achieve the "indexed" partition pruning, we'd need
> - a way to specify that all constraints are mutually exclusive
> - a declarative approach for saying something like "arranged in date
> sequence"
> - preferably a way to have this happen at run-time so we can hard-plan a
> query with CURRENT_TIMESTAMP in the WHERE clause

Definitely a direction worth pursuing, but it seems like it would be a
completely separate code path from the existing constraint-checking
code.  I'd imagine that instead of having to prove theorems about which
tables to scan, a declarative approach would let us "just know" what
to do.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-07 Thread Rod Taylor
On Mon, 2006-08-07 at 13:44 -0400, Tom Lane wrote:
> Rod Taylor <[EMAIL PROTECTED]> writes:
> > A simple way of doing this might be to use a minimum cost number?
> 
> But you don't have any cost numbers until after you've done the plan.

Isn't it possible to find the cost using the straight forward (fast)
method, find out what order of magnitude it is in, then do a second pass
with additional planner bells and whistles turned on if the first plan
had a high estimate?

-- 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-07 Thread Tom Lane
Rod Taylor <[EMAIL PROTECTED]> writes:
> A simple way of doing this might be to use a minimum cost number?

But you don't have any cost numbers until after you've done the plan.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-07 Thread Rod Taylor
On Mon, 2006-08-07 at 16:54 +0100, Simon Riggs wrote:
> On Fri, 2006-08-04 at 14:40 -0400, Tom Lane wrote:
> > I was just looking at Martin Lesser's gripe here:
> > http://archives.postgresql.org/pgsql-performance/2006-08/msg00053.php
> > about how the planner is not real bright about the filter conditions
> > it generates for a simple partitioning layout.  In particular it's
> > generating scans involving self-contradictory conditions:
> > 
> >  Result  (cost=0.00..33.20 rows=6 width=36)
> >->  Append  (cost=0.00..33.20 rows=6 width=36)
> >  ->  Seq Scan on t_parted  (cost=0.00..33.20 rows=6 width=36)
> >Filter: ((id1 >= 0) AND (id1 < 100) AND (id1 >= 900) AND 
> > (id1 < 1000))
> > 
> > which it seems we ought to be bright enough to notice.  In particular
> > I would argue that turning on constraint_exclusion ought to instruct
> > the planner to catch this sort of thing, whereas when it's off we
> > ought not expend the cycles.  I have a preliminary patch (below)
> > that seems to fix it.
> > 
> > The problem I'm having is that this isn't "constraint exclusion" anymore
> > --- it will in fact make useful deductions without a table constraint
> > anywhere in sight.  Should we rename the GUC variable, and if so to what?
> > Or just live with the misnomer?  I guess plan C would be to invent a
> > separate GUC variable for the other kind of test, but I can't see that
> > it's worth having two.  Thoughts?
> 
> In general, I'd prefer a control that allowed "amount of planning" to be
> specified, much in the same way we rate error messages. We really want
> just one simple knob that can be turned up or down, no matter how many
> new optimizations we add.
> 
> planning_effort = LOW | MEDIUM | HIGH | VERYHIGH | EXHAUSTIVE

A simple way of doing this might be to use a minimum cost number?

# Minimum cost of query is over 100 before applying
mutual_exclusion = 100

Once applied if the filter accomplished something the query is replanned
or adjusted to take that change into account.

If there were a large number of constraints on t_parted it may well have
taken longer to plan than to execute on the 6 rows. If there were 1M
rows in the structure, the extra effort would have been well worth it.


Ideally we could set the planning time as a percentage of total
execution time and let PostgreSQL figure out what should be tried and
when, but that means giving a cost to planner functionality and having
PostgreSQL plan how to plan.

planning_effort = 5%

-- 


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-07 Thread Simon Riggs
On Fri, 2006-08-04 at 14:40 -0400, Tom Lane wrote:
> I was just looking at Martin Lesser's gripe here:
> http://archives.postgresql.org/pgsql-performance/2006-08/msg00053.php
> about how the planner is not real bright about the filter conditions
> it generates for a simple partitioning layout.  In particular it's
> generating scans involving self-contradictory conditions:
> 
>  Result  (cost=0.00..33.20 rows=6 width=36)
>->  Append  (cost=0.00..33.20 rows=6 width=36)
>  ->  Seq Scan on t_parted  (cost=0.00..33.20 rows=6 width=36)
>Filter: ((id1 >= 0) AND (id1 < 100) AND (id1 >= 900) AND (id1 
> < 1000))
> 
> which it seems we ought to be bright enough to notice.  In particular
> I would argue that turning on constraint_exclusion ought to instruct
> the planner to catch this sort of thing, whereas when it's off we
> ought not expend the cycles.  I have a preliminary patch (below)
> that seems to fix it.
> 
> The problem I'm having is that this isn't "constraint exclusion" anymore
> --- it will in fact make useful deductions without a table constraint
> anywhere in sight.  Should we rename the GUC variable, and if so to what?
> Or just live with the misnomer?  I guess plan C would be to invent a
> separate GUC variable for the other kind of test, but I can't see that
> it's worth having two.  Thoughts?

In general, I'd prefer a control that allowed "amount of planning" to be
specified, much in the same way we rate error messages. We really want
just one simple knob that can be turned up or down, no matter how many
new optimizations we add.

planning_effort = LOW | MEDIUM | HIGH | VERYHIGH | EXHAUSTIVE

Though I'd prefer about 6-10 settings

> BTW, the remaining infelicities in Martin's example stem from the fact that
> predicate_refuted_by doesn't recognize "x IS NOT TRUE" as refuting "x".
> Working on fixing that now.

Seen that. Looks good.

> I'm also thinking that formulating the check as "constraints are
> refuted by WHERE clause" might be unnecessarily restrictive.  Doesn't
> the case where a constraint implies falsity of a WHERE clause likewise
> tell us we needn't bother to scan?  Seems like we ought to put all the
> conditions together and run a symmetric test named something like
> "mutually_exclusive_conditions".  Maybe "mutual_exclusion" would be
> a better name for the GUC variable.

That would be another approach to planning this, but I see a future that
may interfere with that suggestion (maybe not, so read on).

Currently, exclusion is tested for each relation that might possibly be
involved, so planning time increases O(N) for N partitions. I would like
to see a declarative approach where the constraints on all of the child
partitions form a coherent pattern that can be used to process far fewer
exclusion tests to enable us to achieve O(logN) behaviour. I'm not sure
how we'd do that if we went for the symmetric test approach.

To achieve the "indexed" partition pruning, we'd need
- a way to specify that all constraints are mutually exclusive
- a declarative approach for saying something like "arranged in date
sequence"
- preferably a way to have this happen at run-time so we can hard-plan a
query with CURRENT_TIMESTAMP in the WHERE clause

The reason for all of that is to better enable partitioning for use in
OLTP situations, not just DW.

With all of that in mind, I think renaming the current GUC would be a
fairly short-lived name change, so I'd suggest sticking with it until
the functions have been enhanced to the point we can dream up a slightly
more evocative name.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-07 Thread stark

Tom Lane <[EMAIL PROTECTED]> writes:

> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
>
>> How many cycles are we talking about here? Is it even worth the GUC?
>
> I think so.  On simple queries the optimization will *never* fire,
> and there's no point in doing the search.  People who are running
> complex queries will want to turn it on, but the mysql-equivalent
> crew will just find it a waste of cycles.

The other class of people who will find this kind of thing useful are those
using automatically generated queries. Frequently you end up with redundant
clauses or "unreachable" clauses that you hope the database will be able to
see through.

Having to enable that intelligence with a GUC is fine though since those users
could just enable it even if they aren't using partitioning. That said I
expect that eventually any option we add whose only purpose is it to enable
some intelligence in the optimizer will become standard.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-04 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> On Fri, Aug 04, 2006 at 02:40:30PM -0400, Tom Lane wrote:
>> I would argue that turning on constraint_exclusion ought to instruct
>> the planner to catch this sort of thing, whereas when it's off we
>> ought not expend the cycles.  I have a preliminary patch (below)
>> that seems to fix it.

> How many cycles are we talking about here? Is it even worth the GUC?

I think so.  On simple queries the optimization will *never* fire,
and there's no point in doing the search.  People who are running
complex queries will want to turn it on, but the mysql-equivalent
crew will just find it a waste of cycles.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-04 Thread Jim C. Nasby
On Fri, Aug 04, 2006 at 02:40:30PM -0400, Tom Lane wrote:
> which it seems we ought to be bright enough to notice.  In particular
> I would argue that turning on constraint_exclusion ought to instruct
> the planner to catch this sort of thing, whereas when it's off we
> ought not expend the cycles.  I have a preliminary patch (below)
> that seems to fix it.

How many cycles are we talking about here? Is it even worth the GUC?

The most heavily loaded systems I've seen were doing on the order of 500
transactions a second (which would have been almost entirely
single-statement queries), so ISTM that you've got to be burning a
pretty good chunk of CPU time on planning before it becomes an issue.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] "Constraint exclusion" is not general enough

2006-08-04 Thread Hannu Krosing
Ühel kenal päeval, R, 2006-08-04 kell 14:40, kirjutas Tom Lane:
> I was just looking at Martin Lesser's gripe here:
> http://archives.postgresql.org/pgsql-performance/2006-08/msg00053.php
> about how the planner is not real bright about the filter conditions
> it generates for a simple partitioning layout.  In particular it's
> generating scans involving self-contradictory conditions:
> 
>  Result  (cost=0.00..33.20 rows=6 width=36)
>->  Append  (cost=0.00..33.20 rows=6 width=36)
>  ->  Seq Scan on t_parted  (cost=0.00..33.20 rows=6 width=36)
>Filter: ((id1 >= 0) AND (id1 < 100) AND (id1 >= 900) AND (id1 
> < 1000))
> 
> which it seems we ought to be bright enough to notice.  In particular
> I would argue that turning on constraint_exclusion ought to instruct
> the planner to catch this sort of thing, whereas when it's off we
> ought not expend the cycles.  I have a preliminary patch (below)
> that seems to fix it.
> 
> The problem I'm having is that this isn't "constraint exclusion" anymore
> --- it will in fact make useful deductions without a table constraint
> anywhere in sight. 

I'd just keep the name. the parts of WHERE cluse can be described as
constraints on what is returned :)


-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq