Re: [HACKERS] parameter hints to the optimizer

2004-08-12 Thread Bruce Momjian

OK, I added this TODO:


* Allow finer control over the caching of prepared query plans

  Currently, queries prepared via the libpq API are planned on first
  execute using the supplied parameters --- allow SQL PREPARE to do the
  same.  Also, allow control over replanning prepared queries either
  manually or automatically when statistics for execute parameters
  differ dramatically from those used during planning.

---

Oliver Jowett wrote:
 Bruce Momjian wrote:
  Oliver Jowett wrote:
  
 Bruce Momjian wrote:
 
 Oliver Jowett wrote:
 
 
 Merlin Moncure wrote:
 
 
 
 Another way to deal with the problem is to defer plan generation until
 the first plan execution and use the parameters from that execution.
 
 When talking the V3 protocol, 7.5 defers plan generation for the unnamed 
 statement until parameters are received in the Bind message (which is 
 essentially the same as what you describe). There was some discussion at 
 the time about making it more flexible so you could apply it to arbitary 
 statements, but that needed a protocol change so it didn't happen.
 
 
 What do you mean about arbitrary statements?  Non-prepared ones, or
 non-unnamed ones?
 
 Non-unnamed ones. Adding flag on the Parse message that says when to 
 plan the statement (or maybe on each Bind message even).
  
  
  OK, what are unnamed prepared statements?  When are they used currently?
  Only via the wire protocol?  Who uses them now?
 
 The unnamed prepared statement is like any other prepared statement 
 except it doesn't have a name :)  It can be accessed via:
 
 1) V3 protocol Parse/Bind with an empty statement name uses the unnamed 
 statement.
 2) V2 or V3 simple query implicitly closes the unnamed statement.
 
 CVS HEAD defers planning in case (1) until the Bind is received so it 
 can do planning cost estimation using concrete parameter values and 
 produce a better plan. It only does this for the unnamed statement, not 
 for named statements. If you Parse into a named statement, planning 
 happens immediately when the Parse is done.
 
 This behaviour gives the client some flexibility without changing the 
 protocol. It means that using Parse/Bind on the unnamed statement with 
 parameters is essentially equivalent planning-wise to substituting the 
 parameter values into the actual query and submitting that instead.
 
 What we talked about briefly was providing some way to control when 
 planning was done on a per-statement basis -- so you could say don't 
 defer planning for this unnamed query because I'm going to reuse the 
 unnamed statement multiple times and the first set of parameters might 
 not generate an efficient plan or do defer planning of this named 
 query because I know I will be executing it with many similar parameter 
 values and estimating using the first set of parameters gives a good plan.
 
 Or an alternative is to have a way to control query replanning on each 
 Bind individually -- so a client can get the benefit of skipping the 
 parse step on subsequent executions and is able to pass parameters via 
 Bind, but the query is replanned for the concrete parameter values on 
 each execution. The JDBC driver wants this -- currently the use of named 
 statements has to be explicitly turned on as with the current behaviour 
 you may take a performance hit due to less-than-ideal plans as soon as 
 you start using named statements.
 
 So maybe the TODO should be something like allow finer-grained client 
 control of query estimation and (re-)planning when using Parse/Bind.
 
 -O
 
 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] parameter hints to the optimizer

2004-08-07 Thread Oliver Jowett
Bruce Momjian wrote:
Oliver Jowett wrote:
Merlin Moncure wrote:

Another way to deal with the problem is to defer plan generation until
the first plan execution and use the parameters from that execution.
When talking the V3 protocol, 7.5 defers plan generation for the unnamed 
statement until parameters are received in the Bind message (which is 
essentially the same as what you describe). There was some discussion at 
the time about making it more flexible so you could apply it to arbitary 
statements, but that needed a protocol change so it didn't happen.

What do you mean about arbitrary statements?  Non-prepared ones, or
non-unnamed ones?
Non-unnamed ones. Adding flag on the Parse message that says when to 
plan the statement (or maybe on each Bind message even).

-O
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] parameter hints to the optimizer

2004-08-07 Thread Bruce Momjian
Oliver Jowett wrote:
 Bruce Momjian wrote:
  Oliver Jowett wrote:
  
 Merlin Moncure wrote:
 
 
 Another way to deal with the problem is to defer plan generation until
 the first plan execution and use the parameters from that execution.
 
 When talking the V3 protocol, 7.5 defers plan generation for the unnamed 
 statement until parameters are received in the Bind message (which is 
 essentially the same as what you describe). There was some discussion at 
 the time about making it more flexible so you could apply it to arbitary 
 statements, but that needed a protocol change so it didn't happen.
  
  
  What do you mean about arbitrary statements?  Non-prepared ones, or
  non-unnamed ones?
 
 Non-unnamed ones. Adding flag on the Parse message that says when to 
 plan the statement (or maybe on each Bind message even).

OK, what are unnamed prepared statements?  When are they used currently?
Only via the wire protocol?  Who uses them now?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] parameter hints to the optimizer

2004-08-07 Thread Oliver Jowett
Bruce Momjian wrote:
Oliver Jowett wrote:
Bruce Momjian wrote:
Oliver Jowett wrote:

Merlin Moncure wrote:

Another way to deal with the problem is to defer plan generation until
the first plan execution and use the parameters from that execution.
When talking the V3 protocol, 7.5 defers plan generation for the unnamed 
statement until parameters are received in the Bind message (which is 
essentially the same as what you describe). There was some discussion at 
the time about making it more flexible so you could apply it to arbitary 
statements, but that needed a protocol change so it didn't happen.

What do you mean about arbitrary statements?  Non-prepared ones, or
non-unnamed ones?
Non-unnamed ones. Adding flag on the Parse message that says when to 
plan the statement (or maybe on each Bind message even).

OK, what are unnamed prepared statements?  When are they used currently?
Only via the wire protocol?  Who uses them now?
The unnamed prepared statement is like any other prepared statement 
except it doesn't have a name :)  It can be accessed via:

1) V3 protocol Parse/Bind with an empty statement name uses the unnamed 
statement.
2) V2 or V3 simple query implicitly closes the unnamed statement.

CVS HEAD defers planning in case (1) until the Bind is received so it 
can do planning cost estimation using concrete parameter values and 
produce a better plan. It only does this for the unnamed statement, not 
for named statements. If you Parse into a named statement, planning 
happens immediately when the Parse is done.

This behaviour gives the client some flexibility without changing the 
protocol. It means that using Parse/Bind on the unnamed statement with 
parameters is essentially equivalent planning-wise to substituting the 
parameter values into the actual query and submitting that instead.

What we talked about briefly was providing some way to control when 
planning was done on a per-statement basis -- so you could say don't 
defer planning for this unnamed query because I'm going to reuse the 
unnamed statement multiple times and the first set of parameters might 
not generate an efficient plan or do defer planning of this named 
query because I know I will be executing it with many similar parameter 
values and estimating using the first set of parameters gives a good plan.

Or an alternative is to have a way to control query replanning on each 
Bind individually -- so a client can get the benefit of skipping the 
parse step on subsequent executions and is able to pass parameters via 
Bind, but the query is replanned for the concrete parameter values on 
each execution. The JDBC driver wants this -- currently the use of named 
statements has to be explicitly turned on as with the current behaviour 
you may take a performance hit due to less-than-ideal plans as soon as 
you start using named statements.

So maybe the TODO should be something like allow finer-grained client 
control of query estimation and (re-)planning when using Parse/Bind.

-O
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] parameter hints to the optimizer

2004-08-06 Thread Bruce Momjian
Oliver Jowett wrote:
 Merlin Moncure wrote:
 
  Another way to deal with the problem is to defer plan generation until
  the first plan execution and use the parameters from that execution.
 
 When talking the V3 protocol, 7.5 defers plan generation for the unnamed 
 statement until parameters are received in the Bind message (which is 
 essentially the same as what you describe). There was some discussion at 
 the time about making it more flexible so you could apply it to arbitary 
 statements, but that needed a protocol change so it didn't happen.

What do you mean about arbitrary statements?  Non-prepared ones, or
non-unnamed ones?

I am trying to figure out what TODO item needs to be added.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] parameter hints to the optimizer

2004-07-21 Thread Merlin Moncure
There is one problem with the optimizer that is a constant source of
frustration.  When using prepared statements and functions (particularly
where function parameters are passed to a query), the optimizer often
fails to utilize an index inside a plan.  

This is a well known problem because the parameter values are not known
at the time the plan is generated, making things difficult for the
optimizer.

It would be nice if 'hint' or sample parameters could be used when
creating the statement so the optimizer could use those values when
generating the plan.  For example, the default parameter syntax of C++
could be borrowed (assuming this doesn't break any syntax rules).

example:
prepare my_statement prepare (character varying='abc')
as select * from t where t.k = $1;

create function my_function(int4=1234) returns [...]

Another way to deal with the problem is to defer plan generation until
the first plan execution and use the parameters from that execution.

Am I crazy?  Comments?
Merlin


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] parameter hints to the optimizer

2004-07-21 Thread Hans-Jürgen Schönig
Merlin Moncure wrote:
Merlin,
This will most likely never be accepted by the core team because it is
better to spend more time on fixing the planner than to invent some
non-standard.
As far as I know some databases support a syntax where hints can be
hidden in comments or something like that.
Meanwhile I think that hints are more of a burdon than a help.
Regards,
		Hans

I thought as much.  Still, the parameter problem is a huge headache.  Maybe if it was 
possible to use the statistics to gather a 'pseudovariable' to feed to the parameter 
based on some algorithm, the planner could give better results without exposing the 
planner inner workings to the user.
Thanks for the feedback.
Merlin

Using statistics is exactly what the planner does ...
So why should data coming from the planner being given back to the 
planner? Doesn't make sense.

For more information I highly recommend Tom's talk at Oreilly's some 
time ago. I think it is called recent improvements in 7.4 ...

regards,
Hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/720/10 1234567 or +43/664/816 40 77
www.cybertec.at, www.postgresql.at, kernel.cybertec.at
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] parameter hints to the optimizer

2004-07-21 Thread Simon Riggs
On Wed, 2004-07-21 at 19:12, Merlin Moncure wrote:
 There is one problem with the optimizer that is a constant source of
 frustration.  When using prepared statements and functions (particularly
 where function parameters are passed to a query), the optimizer often
 fails to utilize an index inside a plan.  
 
 This is a well known problem because the parameter values are not known
 at the time the plan is generated, making things difficult for the
 optimizer.
 
 It would be nice if 'hint' or sample parameters could be used when
 creating the statement so the optimizer could use those values when
 generating the plan.  For example, the default parameter syntax of C++
 could be borrowed (assuming this doesn't break any syntax rules).
 
 example:
 prepare my_statement prepare (character varying='abc')
 as select * from t where t.k = $1;
 
 create function my_function(int4=1234) returns [...]
 
 Another way to deal with the problem is to defer plan generation until
 the first plan execution and use the parameters from that execution.
 
 Am I crazy?  Comments?

Crazy enough to suggest what other RDBMS do.

It's a common problem, since it defeats the use of histogram statistics
to determine specific cardinality rather than generic cardinality.

The answer is to follow what those others do, since programs will be
written to take advantage of those optimization quirks.

DB2 supports various modes for BIND: REOPT(ALWAYS), REOPT(ONCE),
REOPT(VARS) and REOPT(NONE) - which are then manifested in their
precompiler.

..back to you,

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] parameter hints to the optimizer

2004-07-21 Thread Oliver Jowett
Merlin Moncure wrote:
Another way to deal with the problem is to defer plan generation until
the first plan execution and use the parameters from that execution.
When talking the V3 protocol, 7.5 defers plan generation for the unnamed 
statement until parameters are received in the Bind message (which is 
essentially the same as what you describe). There was some discussion at 
the time about making it more flexible so you could apply it to arbitary 
statements, but that needed a protocol change so it didn't happen.

So the guts of the work are done -- we'd just need a way to trigger the 
behaviour on demand. It sounds less painful to add something to PREPARE 
than to change the V3 protocol at this stage.

-O
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match