Re: [PERFORM] Hints (was Poor performance using CTE)

2012-12-03 Thread Gavin Flower

On 28/11/12 15:17, Craig Ringer wrote:

On 27/11/2012 3:42 PM, Scott Marlowe wrote:


Here here!  PostgreSQL is well known for its extensibility and this is
the perfect place for hints.
I agree with the sentiment and your concerns. However, this doesn't 
solve the CTE problem.


Some people are relying on the planner's inability to push conditions 
into / pull conditions out of CTEs, and otherwise re-arrange them. If 
support for optimising into eligible CTEs (ie CTE terms that contain 
only SELECT or VALUES and call no VOLATILE functions) then these 
applications will potentially encounter serious performance regressions.


Should this feature never be added to Pg, making it different and 
incompatible with other DBs that implement CTE optimisation, just 
because some people are using it for a hacky hint like OFFSET 0?


Should these applications just be broken by the update, with people 
told to add `OFFSET 0` or load some not-yet-existing hints module 
after reporting the performance issue to the list?


I don't think either of those are acceptable. Sooner or later 
somebody's going to want to add CTE optimisation, and I don't think 
that you can't or great, we'll do it and break everything are 
acceptable responses to any proposed patch someone might come up with 
to add that.


A GUC might be OK, as apps can always SET it before problem queries or 
not-yet-ported code. It'd probably reduce the rate at which people 
fixed their code considerably, though, going by past experience with 
standard_conforming_strings, etc, but it'd work.


--
Craig Ringer



I think it would be best to be something in the SQL for SELECT, as:

1. One is more likely to find it by looking up the documentation for SELECT

2. It could allow selective application within a SELECT: one could have
   several queries within the WITH clause: where all except one might
   benefit for optimisation, and the exception might cause problems

I have suggested a couple possible syntax paterns, but there may well be 
better alternative syntaxes.



Cheers,
Gavin


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-27 Thread Scott Marlowe
On Fri, Nov 23, 2012 at 3:05 AM, Cédric Villemain
ced...@2ndquadrant.com wrote:
 Le mercredi 21 novembre 2012 17:34:02, Craig James a écrit :
 On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner kgri...@mail.com wrote:
  It's a tough problem. Disguising and not documenting the available
  optimizer hints leads to more reports on where the optimizer should
  be smarter, and has spurred optimizer improvements. ...
  Regarding the above-mentioned benefits we would stand to lose by
  having clear and documented hints, perhaps we could occasionally
  solicit input on where people are finding hints useful to get ideas
  on where we might want to improve the optimizer. As far as worrying
  about people using hints to force a plan which is sub-optimal --
  isn't that getting into nanny mode a bit too much?

 Toward that end, the hint documentation (which is almost always viewed as
 HTML) could be prefaced by a strong suggestion to post performance
 questions in this group first, with links to the subscribe page and the
 how to report performance problems FAQ. The hint documentation could even
 be minimalistic; suggest to developers that they should post their
 problematic queries here before resorting to hints.  That would give the
 experts an opportunity to provide the normal advice.  The correct hint
 syntax would be suggested only when all other avenues failed.

 We have hooks in PostgreSQL. We already have at least one extension which is
 using that to change the planner behavior.

 We can have a bit more hooks and try to improve the cost estimate, this part
 of the code is known to be built by reports and human estimations, also the
 9.2 version got heavy modifications in this area.

 Let the 'Hints' be inside an extension thus we are able to track them and fix
 the planner/costestimate issues.

 I don't see why PostgreSQL needs 'Hints' *in-core*.

Here here!  PostgreSQL is well known for its extensibility and this is
the perfect place for hints.  That way they can get worked on without
becoming a crutch for every user and forcing the backend developers to
support what may or may not be a good idea syntax wise.  After a few
different people have banged some code out to make workable hint
syntaxes for their own use maybe then it will be time to revisit
adding hints to core.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-27 Thread Craig Ringer

On 27/11/2012 3:42 PM, Scott Marlowe wrote:


Here here!  PostgreSQL is well known for its extensibility and this is
the perfect place for hints.
I agree with the sentiment and your concerns. However, this doesn't 
solve the CTE problem.


Some people are relying on the planner's inability to push conditions 
into / pull conditions out of CTEs, and otherwise re-arrange them. If 
support for optimising into eligible CTEs (ie CTE terms that contain 
only SELECT or VALUES and call no VOLATILE functions) then these 
applications will potentially encounter serious performance regressions.


Should this feature never be added to Pg, making it different and 
incompatible with other DBs that implement CTE optimisation, just 
because some people are using it for a hacky hint like OFFSET 0?


Should these applications just be broken by the update, with people told 
to add `OFFSET 0` or load some not-yet-existing hints module after 
reporting the performance issue to the list?


I don't think either of those are acceptable. Sooner or later somebody's 
going to want to add CTE optimisation, and I don't think that you 
can't or great, we'll do it and break everything are acceptable 
responses to any proposed patch someone might come up with to add that.


A GUC might be OK, as apps can always SET it before problem queries or 
not-yet-ported code. It'd probably reduce the rate at which people fixed 
their code considerably, though, going by past experience with 
standard_conforming_strings, etc, but it'd work.


--
Craig Ringer


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-27 Thread Scott Marlowe
On Tue, Nov 27, 2012 at 7:17 PM, Craig Ringer cr...@2ndquadrant.com wrote:
 On 27/11/2012 3:42 PM, Scott Marlowe wrote:

 Here here!  PostgreSQL is well known for its extensibility and this is
 the perfect place for hints.

 I agree with the sentiment and your concerns. However, this doesn't solve
 the CTE problem.

 Some people are relying on the planner's inability to push conditions into /
 pull conditions out of CTEs, and otherwise re-arrange them. If support for
 optimising into eligible CTEs (ie CTE terms that contain only SELECT or
 VALUES and call no VOLATILE functions) then these applications will
 potentially encounter serious performance regressions.

 Should this feature never be added to Pg, making it different and
 incompatible with other DBs that implement CTE optimisation, just because
 some people are using it for a hacky hint like OFFSET 0?

I'm strictly talking about any hinting mechanism being added being an
extension.  Fixing the planner so that optimizations can get cross the
CTE boundary seems the domain of back end hackers, not extensions.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-23 Thread Cédric Villemain
Le mercredi 21 novembre 2012 17:34:02, Craig James a écrit :
 On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner kgri...@mail.com wrote:
  It's a tough problem. Disguising and not documenting the available
  optimizer hints leads to more reports on where the optimizer should
  be smarter, and has spurred optimizer improvements. ...
  Regarding the above-mentioned benefits we would stand to lose by
  having clear and documented hints, perhaps we could occasionally
  solicit input on where people are finding hints useful to get ideas
  on where we might want to improve the optimizer. As far as worrying
  about people using hints to force a plan which is sub-optimal --
  isn't that getting into nanny mode a bit too much?
 
 Toward that end, the hint documentation (which is almost always viewed as
 HTML) could be prefaced by a strong suggestion to post performance
 questions in this group first, with links to the subscribe page and the
 how to report performance problems FAQ. The hint documentation could even
 be minimalistic; suggest to developers that they should post their
 problematic queries here before resorting to hints.  That would give the
 experts an opportunity to provide the normal advice.  The correct hint
 syntax would be suggested only when all other avenues failed.

We have hooks in PostgreSQL. We already have at least one extension which is 
using that to change the planner behavior.

We can have a bit more hooks and try to improve the cost estimate, this part 
of the code is known to be built by reports and human estimations, also the 
9.2 version got heavy modifications in this area. 

Let the 'Hints' be inside an extension thus we are able to track them and fix 
the planner/costestimate issues.

I don't see why PostgreSQL needs 'Hints' *in-core*.
-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation


signature.asc
Description: This is a digitally signed message part.


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-23 Thread Gavin Flower

On 22/11/12 06:28, Craig James wrote:



On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway m...@joeconway.com 
mailto:m...@joeconway.com wrote:


On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
 Rather than telling the planner what to do or not to do, I'd
much rather
 have hints that give the planner more information about the
tables and
 quals involved in the query. A typical source of bad plans is
when the
 planner gets its cost estimates wrong. So rather than telling the
 planner to use a nested loop join for a INNER JOIN b ON a.id
http://a.id = b.id http://b.id,
 the user could tell the planner that there are only 10 rows that
match
 the a.id http://a.id = b.id http://b.id qual. That gives
the planner the information it needs
 to choose the right plan on its own. That kind of hints would be
much
 less implementation specific and much more likely to still be
useful, or
 at least not outright counter-productive, in a future version with a
 smarter planner.

 You could also attach that kind of hints to tables and columns,
which
 would be more portable and nicer than decorating all queries.

I like this idea, but also think that if we have a syntax to allow
hints, it would be nice to have a simple way to ignore all hints
(yes, I
suppose I'm suggesting yet another GUC). That way after sprinkling
your
SQL with hints, you could easily periodically (e.g. after a Postgres
upgrade) test what would happen if the hints were removed.


Or a three-way choice: Allow, ignore, or generate an error. That would 
allow developers to identify where hints are being used.


Craig


Joe
--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support



Or perhaps hints should have the pg version attached, so that they are 
automatically ignored when the pg version changed?  Problem may then 
become people reluctant to upgrade because their hints relate to a 
previous version!  Sigh...


Even requiring registration of hints and expiring them after a limited 
time period would not work - as people would simply automate the process 
of registration  application...



Cheers,
Gavin


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-22 Thread Heikki Linnakangas

On 22.11.2012 02:53, Jeff Janes wrote:

That gives the planner the information it needs to choose the right plan on
its own. That kind of hints would be much less implementation specific and
much more likely to still be useful, or at least not outright
counter-productive, in a future version with a smarter planner.


When I run into unexpectedly poor performance, I have an intuitive
enough feel for my own data that I know what plan it ought to be
using.  Figuring out why it is not using it is very hard.  For one
thing, EXPLAIN tells you about the winning plan, but there is no
visibility into what ought to be the winning plan but isn't, so no way
to see why it isn't.So you first have to use our existing non-hint
hints (enable_*, doing weird things with cost_*, CTE stuff) to trick
it into using the plan I want it to use, before I can figure out why
it isn't using it, before I could figure out what hints of the style
you are suggesting to supply to get it to use it.


I'm sure that happens too, but my gut feeling is that more often the 
EXPLAIN ANALYZE output reveals a bad estimate somewhere in the plan, and 
the planner chooses a bad plan based on the bad estimate. If you hint 
the planner by giving a better estimate for where the estimator got it 
wrong, the planner will choose the desired plan.


- Heikki


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Shaun Thomas

On 11/20/2012 08:15 PM, Craig Ringer wrote:


I think it's time to admit that and get the syntax in place for CTEs so
there's room to optimize them later, rather than cementing
CTEs-as-fences in forever as a Pg quirk.


I know I'm just some schmo, but I'd vote for this. I'm certainly guilty 
of using OFFSET 0. Undocumented hints are still hints. As much as I 
think they're a bad idea by cementing a certain plan that may not get 
the benefits of future versions, non-intuitive side-effects by using 
overloaded syntax are worse.


I've been using CTEs as temp tables because I know that's how they work. 
But I'd be more than willing to modify my syntax one way or the other to 
adopt non-materialized CTEs, provided there's some way to get the 
current behavior.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Kevin Grittner
Craig Ringer wrote:
 On 11/21/2012 09:35 AM, Craig James wrote:
 Why not make an explicit hint syntax and document it? I've still
 don't understand why hint is a dirty word in Postgres. There are
 a half-dozen or so ways in common use to circumvent or correct
 sub-optimal plans.
 
 The reason usually given is that hints provide easy workarounds for
 planner and stats issues, so people don't report problems or fix
 the underlying problem.
 
 Of course, if that's all there was to it, `OFFSET 0` would be made
 into an error or warning, or ignored and not fenced.
 
 The reality is, as you say, that there's a need, because the
 planner can never be perfect - or rather, if it were nearly
 perfect, it'd take so long to read the stats and calculate plans
 that everything would be glacially slow anyway. The planner has to
 compromise, and so cases will always arise where it needs a little
 help.
 
 I think it's time to admit that and get the syntax in place for
 CTEs so there's room to optimize them later, rather than cementing
 CTEs-as-fences in forever as a Pg quirk.

It's a tough problem. Disguising and not documenting the available
optimizer hints leads to more reports on where the optimizer should
be smarter, and has spurred optimizer improvements. And many type of
hints would undoubtedly cause people to force what they *think* would
be the best plan in many cases where they are wrong, or become wrong
as data scales up. But it does seem odd every time I hear people
saying that they don't want to eliminate some optimization fence
because they find it useful while simultaneously arguing that we
don't have or want hints.

Having a way to coerce the optimizer from the plan it would take with
straightforward coding *is* a hint, and one down-side of hiding the
hints inside syntax mostly supported for other reasons is that people
who don't know about these clever devices can't do reasonable
refactoring of queries for readability without risking performance
regressions. Another down-side is that perfectly reasonable queries
ported from other databases that use hint syntax for hints run afoul
of the secret hints when trying to run queries on PostgreSQL, and get
performance potentially orders of magnitude worse than they expect.

I'm not sure what the best answer is, but as long as we have hints,
but only through OFFSET 0 or CTE usage, that should be documented.
Better, IMV, would be to identify what sorts of hints people actually
find useful, and use that as the basis for TODO items for optimizer
improvement as well as inventing clear ways to specify the desired
coercion. I liked the suggestion that a CTE which didn't need to be
materialized because of side-effects or multiple references have a
keyword. Personally, I think that AS MATERIALIZED x (SELECT ...)
would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to
specify that.

Regarding the above-mentioned benefits we would stand to lose by
having clear and documented hints, perhaps we could occasionally
solicit input on where people are finding hints useful to get ideas
on where we might want to improve the optimizer. As far as worrying
about people using hints to force a plan which is sub-optimal --
isn't that getting into nanny mode a bit too much?

-Kevin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Heikki Linnakangas

On 21.11.2012 15:42, Kevin Grittner wrote:

Better, IMV, would be to identify what sorts of hints people actually
find useful, and use that as the basis for TODO items for optimizer
improvement as well as inventing clear ways to specify the desired
coercion. I liked the suggestion that a CTE which didn't need to be
materialized because of side-effects or multiple references have a
keyword. Personally, I think that AS MATERIALIZED x (SELECT ...)
would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to
specify that.


Rather than telling the planner what to do or not to do, I'd much rather 
have hints that give the planner more information about the tables and 
quals involved in the query. A typical source of bad plans is when the 
planner gets its cost estimates wrong. So rather than telling the 
planner to use a nested loop join for a INNER JOIN b ON a.id = b.id, 
the user could tell the planner that there are only 10 rows that match 
the a.id = b.id qual. That gives the planner the information it needs 
to choose the right plan on its own. That kind of hints would be much 
less implementation specific and much more likely to still be useful, or 
at least not outright counter-productive, in a future version with a 
smarter planner.


You could also attach that kind of hints to tables and columns, which 
would be more portable and nicer than decorating all queries.


- Heikki


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Craig James
On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner kgri...@mail.com wrote:


 It's a tough problem. Disguising and not documenting the available
 optimizer hints leads to more reports on where the optimizer should
 be smarter, and has spurred optimizer improvements. ...
 Regarding the above-mentioned benefits we would stand to lose by
 having clear and documented hints, perhaps we could occasionally
 solicit input on where people are finding hints useful to get ideas
 on where we might want to improve the optimizer. As far as worrying
 about people using hints to force a plan which is sub-optimal --
 isn't that getting into nanny mode a bit too much?


Toward that end, the hint documentation (which is almost always viewed as
HTML) could be prefaced by a strong suggestion to post performance
questions in this group first, with links to the subscribe page and the
how to report performance problems FAQ. The hint documentation could even
be minimalistic; suggest to developers that they should post their
problematic queries here before resorting to hints.  That would give the
experts an opportunity to provide the normal advice.  The correct hint
syntax would be suggested only when all other avenues failed.

Craig James



 -Kevin



Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Joe Conway
On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
 Rather than telling the planner what to do or not to do, I'd much rather
 have hints that give the planner more information about the tables and
 quals involved in the query. A typical source of bad plans is when the
 planner gets its cost estimates wrong. So rather than telling the
 planner to use a nested loop join for a INNER JOIN b ON a.id = b.id,
 the user could tell the planner that there are only 10 rows that match
 the a.id = b.id qual. That gives the planner the information it needs
 to choose the right plan on its own. That kind of hints would be much
 less implementation specific and much more likely to still be useful, or
 at least not outright counter-productive, in a future version with a
 smarter planner.
 
 You could also attach that kind of hints to tables and columns, which
 would be more portable and nicer than decorating all queries.

I like this idea, but also think that if we have a syntax to allow
hints, it would be nice to have a simple way to ignore all hints (yes, I
suppose I'm suggesting yet another GUC). That way after sprinkling your
SQL with hints, you could easily periodically (e.g. after a Postgres
upgrade) test what would happen if the hints were removed.

Joe
-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Craig James
On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway m...@joeconway.com wrote:

 On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
  Rather than telling the planner what to do or not to do, I'd much rather
  have hints that give the planner more information about the tables and
  quals involved in the query. A typical source of bad plans is when the
  planner gets its cost estimates wrong. So rather than telling the
  planner to use a nested loop join for a INNER JOIN b ON a.id = b.id,
  the user could tell the planner that there are only 10 rows that match
  the a.id = b.id qual. That gives the planner the information it needs
  to choose the right plan on its own. That kind of hints would be much
  less implementation specific and much more likely to still be useful, or
  at least not outright counter-productive, in a future version with a
  smarter planner.
 
  You could also attach that kind of hints to tables and columns, which
  would be more portable and nicer than decorating all queries.

 I like this idea, but also think that if we have a syntax to allow
 hints, it would be nice to have a simple way to ignore all hints (yes, I
 suppose I'm suggesting yet another GUC). That way after sprinkling your
 SQL with hints, you could easily periodically (e.g. after a Postgres
 upgrade) test what would happen if the hints were removed.


Or a three-way choice: Allow, ignore, or generate an error.  That would
allow developers to identify where hints are being used.

Craig



 Joe
 --
 Joe Conway
 credativ LLC: http://www.credativ.us
 Linux, PostgreSQL, and general Open Source
 Training, Service, Consulting,  24x7 Support





Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Joe Conway
On 11/21/2012 09:28 AM, Craig James wrote:
 
 
 On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway m...@joeconway.com
 mailto:m...@joeconway.com wrote:
 
 On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
  Rather than telling the planner what to do or not to do, I'd much
 rather
  have hints that give the planner more information about the tables and
  quals involved in the query. A typical source of bad plans is when the
  planner gets its cost estimates wrong. So rather than telling the
  planner to use a nested loop join for a INNER JOIN b ON a.id
 http://a.id = b.id http://b.id,
  the user could tell the planner that there are only 10 rows that match
  the a.id http://a.id = b.id http://b.id qual. That gives the
 planner the information it needs
  to choose the right plan on its own. That kind of hints would be much
  less implementation specific and much more likely to still be
 useful, or
  at least not outright counter-productive, in a future version with a
  smarter planner.
 
  You could also attach that kind of hints to tables and columns, which
  would be more portable and nicer than decorating all queries.
 
 I like this idea, but also think that if we have a syntax to allow
 hints, it would be nice to have a simple way to ignore all hints (yes, I
 suppose I'm suggesting yet another GUC). That way after sprinkling your
 SQL with hints, you could easily periodically (e.g. after a Postgres
 upgrade) test what would happen if the hints were removed.
 
 
 Or a three-way choice: Allow, ignore, or generate an error.  That would
 allow developers to identify where hints are being used.

+1

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Tom Lane
Craig James cja...@emolecules.com writes:
 On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway m...@joeconway.com wrote:
 I like this idea, but also think that if we have a syntax to allow
 hints, it would be nice to have a simple way to ignore all hints (yes, I
 suppose I'm suggesting yet another GUC). That way after sprinkling your
 SQL with hints, you could easily periodically (e.g. after a Postgres
 upgrade) test what would happen if the hints were removed.

 Or a three-way choice: Allow, ignore, or generate an error.  That would
 allow developers to identify where hints are being used.

Throwing errors would likely prevent you from reaching all parts of your
application, thus preventing complete testing.  Much more sensible to
just log such queries.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Andreas Kretschmer
Craig James cja...@emolecules.com wrote:

 On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
  Rather than telling the planner what to do or not to do, I'd much rather
  have hints that give the planner more information about the tables and
  quals involved in the query. A typical source of bad plans is when the
  planner gets its cost estimates wrong. So rather than telling the
  planner to use a nested loop join for a INNER JOIN b ON a.id = b.id,
  the user could tell the planner that there are only 10 rows that match
  the a.id = b.id qual. That gives the planner the information it needs
  to choose the right plan on its own. That kind of hints would be much
  less implementation specific and much more likely to still be useful, or
  at least not outright counter-productive, in a future version with a
  smarter planner.
 
  You could also attach that kind of hints to tables and columns, which
  would be more portable and nicer than decorating all queries.
 
 I like this idea, but also think that if we have a syntax to allow
 hints, it would be nice to have a simple way to ignore all hints (yes, I
 suppose I'm suggesting yet another GUC). That way after sprinkling your
 SQL with hints, you could easily periodically (e.g. after a Postgres
 upgrade) test what would happen if the hints were removed.
 
 
 Or a three-way choice: Allow, ignore, or generate an error.  That would allow
 developers to identify where hints are being used.
 
 Craig

+1

I think, we HAVE a smart planner, but hints in this direction are okay,
and we need a simple way to make such hints obsolete - for/in the future. 


Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
If I was god, I would recompile penguin with --enable-fly.   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Jeff Janes
On Wed, Nov 21, 2012 at 8:05 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 21.11.2012 15:42, Kevin Grittner wrote:

 Better, IMV, would be to identify what sorts of hints people actually
 find useful, and use that as the basis for TODO items for optimizer
 improvement as well as inventing clear ways to specify the desired
 coercion. I liked the suggestion that a CTE which didn't need to be
 materialized because of side-effects or multiple references have a
 keyword. Personally, I think that AS MATERIALIZED x (SELECT ...)
 would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to
 specify that.


 Rather than telling the planner what to do or not to do, I'd much rather
 have hints that give the planner more information about the tables and quals
 involved in the query. A typical source of bad plans is when the planner
 gets its cost estimates wrong. So rather than telling the planner to use a
 nested loop join for a INNER JOIN b ON a.id = b.id, the user could tell
 the planner that there are only 10 rows that match the a.id = b.id qual.

For each a.id there are 10 b.id, or for each b.id there are 10 a.id?

 That gives the planner the information it needs to choose the right plan on
 its own. That kind of hints would be much less implementation specific and
 much more likely to still be useful, or at least not outright
 counter-productive, in a future version with a smarter planner.

When I run into unexpectedly poor performance, I have an intuitive
enough feel for my own data that I know what plan it ought to be
using.  Figuring out why it is not using it is very hard.  For one
thing, EXPLAIN tells you about the winning plan, but there is no
visibility into what ought to be the winning plan but isn't, so no way
to see why it isn't.So you first have to use our existing non-hint
hints (enable_*, doing weird things with cost_*, CTE stuff) to trick
it into using the plan I want it to use, before I can figure out why
it isn't using it, before I could figure out what hints of the style
you are suggesting to supply to get it to use it.

So I think the type of hints you are suggesting would be about as hard
for the user to use as debugging the planner for the particular case
would be.  While the more traditional type of hint is easy to use,
because the end user understands their data more than they understand
the guts of the planner.


Cheers,

Jeff


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Hints (was Poor performance using CTE)

2012-11-20 Thread Craig James
On Tue, Nov 20, 2012 at 3:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Craig Ringer cr...@2ndquadrant.com writes:
  On 11/21/2012 12:06 AM, Claudio Freire wrote:
  I meant for postgres to do automatically. Rewriting as a join wouldn't
  work as an optimization fence the way we're used to, but pushing
  constraints upwards can only help (especially if highly selective).

  Because people are now used to using CTEs as query hints, it'd probably
  cause performance regressions in working queries. Perhaps more
  importantly, Pg would have to prove that doing so didn't change queries
  that invoked functions with side-effects to avoid changing the results
  of currently valid queries.

 We could trivially arrange to keep the current semantics if the CTE
 query contains any volatile functions (or of course if it's
 INSERT/UPDATE/DELETE).  I think we'd also need to not optimize if
 it's invoked from more than one place in the outer query.

 I think the more interesting question is what cases wouldn't be covered
 by such a rule.  Typically you need to use OFFSET 0 in situations where
 the planner has guessed wrong about costs or rowcounts, and I think
 people are likely using WITH for that as well.  Should we be telling
 people that they ought to insert OFFSET 0 in WITH queries if they want
 to be sure there's an optimization fence?


I'm probably beating a dead horse ... but isn't this just a hint?  Except
that it's worse than a hint, because it's a hint in disguise and is
undocumented.  As far as I can tell, there's no use for OFFSET 0 except
to act as an optimizer fence.

It's clearly an important need, given the nature of the dialog above (and
many others that have passed through this mailing list).

Why not make an explicit hint syntax and document it? I've still don't
understand why hint is a dirty word in Postgres.  There are a half-dozen
or so ways in common use to circumvent or correct sub-optimal plans.

Craig James



 regards, tom lane


 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-20 Thread Craig Ringer
On 11/21/2012 09:35 AM, Craig James wrote:
 Why not make an explicit hint syntax and document it? I've still don't
 understand why hint is a dirty word in Postgres.  There are a
 half-dozen or so ways in common use to circumvent or correct
 sub-optimal plans.


The reason usually given is that hints provide easy workarounds for
planner and stats issues, so people don't report problems or fix the
underlying problem.

Of course, if that's all there was to it, `OFFSET 0` would be made into
an error or warning, or ignored and not fenced.

The reality is, as you say, that there's a need, because the planner can
never be perfect - or rather, if it were nearly perfect, it'd take so
long to read the stats and calculate plans that everything would be
glacially slow anyway. The planner has to compromise, and so cases will
always arise where it needs a little help.

I think it's time to admit that and get the syntax in place for CTEs so
there's room to optimize them later, rather than cementing
CTEs-as-fences in forever as a Pg quirk.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance