Re: [PERFORM] Hints (was Poor performance using CTE)

2012-12-03 Thread Gavin Flower

On 28/11/12 15:17, Craig Ringer wrote:

On 27/11/2012 3:42 PM, Scott Marlowe wrote:


Here here!  PostgreSQL is well known for its extensibility and this is
the perfect place for hints.
I agree with the sentiment and your concerns. However, this doesn't 
solve the CTE problem.


Some people are relying on the planner's inability to push conditions 
into / pull conditions out of CTEs, and otherwise re-arrange them. If 
support for optimising into eligible CTEs (ie CTE terms that contain 
only SELECT or VALUES and call no VOLATILE functions) then these 
applications will potentially encounter serious performance regressions.


Should this feature never be added to Pg, making it different and 
incompatible with other DBs that implement CTE optimisation, just 
because some people are using it for a hacky hint like OFFSET 0?


Should these applications just be broken by the update, with people 
told to add `OFFSET 0` or load some not-yet-existing hints module 
after reporting the performance issue to the list?


I don't think either of those are acceptable. Sooner or later 
somebody's going to want to add CTE optimisation, and I don't think 
that "you can't" or "great, we'll do it and break everything" are 
acceptable responses to any proposed patch someone might come up with 
to add that.


A GUC might be OK, as apps can always SET it before problem queries or 
not-yet-ported code. It'd probably reduce the rate at which people 
fixed their code considerably, though, going by past experience with 
standard_conforming_strings, etc, but it'd work.


--
Craig Ringer



I think it would be best to be something in the SQL for SELECT, as:

1. One is more likely to find it by looking up the documentation for SELECT

2. It could allow selective application within a SELECT: one could have
   several queries within the WITH clause: where all except one might
   benefit for optimisation, and the exception might cause problems

I have suggested a couple possible syntax paterns, but there may well be 
better alternative syntaxes.



Cheers,
Gavin


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-27 Thread Scott Marlowe
On Tue, Nov 27, 2012 at 7:17 PM, Craig Ringer  wrote:
> On 27/11/2012 3:42 PM, Scott Marlowe wrote:
>
>> Here here!  PostgreSQL is well known for its extensibility and this is
>> the perfect place for hints.
>
> I agree with the sentiment and your concerns. However, this doesn't solve
> the CTE problem.
>
> Some people are relying on the planner's inability to push conditions into /
> pull conditions out of CTEs, and otherwise re-arrange them. If support for
> optimising into eligible CTEs (ie CTE terms that contain only SELECT or
> VALUES and call no VOLATILE functions) then these applications will
> potentially encounter serious performance regressions.
>
> Should this feature never be added to Pg, making it different and
> incompatible with other DBs that implement CTE optimisation, just because
> some people are using it for a hacky hint like OFFSET 0?

I'm strictly talking about any hinting mechanism being added being an
extension.  Fixing the planner so that optimizations can get cross the
CTE boundary seems the domain of back end hackers, not extensions.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-27 Thread Craig Ringer

On 27/11/2012 3:42 PM, Scott Marlowe wrote:


Here here!  PostgreSQL is well known for its extensibility and this is
the perfect place for hints.
I agree with the sentiment and your concerns. However, this doesn't 
solve the CTE problem.


Some people are relying on the planner's inability to push conditions 
into / pull conditions out of CTEs, and otherwise re-arrange them. If 
support for optimising into eligible CTEs (ie CTE terms that contain 
only SELECT or VALUES and call no VOLATILE functions) then these 
applications will potentially encounter serious performance regressions.


Should this feature never be added to Pg, making it different and 
incompatible with other DBs that implement CTE optimisation, just 
because some people are using it for a hacky hint like OFFSET 0?


Should these applications just be broken by the update, with people told 
to add `OFFSET 0` or load some not-yet-existing hints module after 
reporting the performance issue to the list?


I don't think either of those are acceptable. Sooner or later somebody's 
going to want to add CTE optimisation, and I don't think that "you 
can't" or "great, we'll do it and break everything" are acceptable 
responses to any proposed patch someone might come up with to add that.


A GUC might be OK, as apps can always SET it before problem queries or 
not-yet-ported code. It'd probably reduce the rate at which people fixed 
their code considerably, though, going by past experience with 
standard_conforming_strings, etc, but it'd work.


--
Craig Ringer


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-27 Thread Scott Marlowe
On Fri, Nov 23, 2012 at 3:05 AM, Cédric Villemain
 wrote:
> Le mercredi 21 novembre 2012 17:34:02, Craig James a écrit :
>> On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner  wrote:
>> > It's a tough problem. Disguising and not documenting the available
>> > optimizer hints leads to more reports on where the optimizer should
>> > be smarter, and has spurred optimizer improvements. ...
>> > Regarding the above-mentioned benefits we would stand to lose by
>> > having clear and documented hints, perhaps we could occasionally
>> > solicit input on where people are finding hints useful to get ideas
>> > on where we might want to improve the optimizer. As far as worrying
>> > about people using hints to force a plan which is sub-optimal --
>> > isn't that getting into nanny mode a bit too much?
>>
>> Toward that end, the hint documentation (which is almost always viewed as
>> HTML) could be prefaced by a strong suggestion to post performance
>> questions in this group first, with links to the "subscribe" page and the
>> "how to report performance problems" FAQ. The hint documentation could even
>> be minimalistic; suggest to developers that they should post their
>> problematic queries here before resorting to hints.  That would give the
>> experts an opportunity to provide the normal advice.  The correct hint
>> syntax would be suggested only when all other avenues failed.
>
> We have hooks in PostgreSQL. We already have at least one extension which is
> using that to change the planner behavior.
>
> We can have a bit more hooks and try to improve the cost estimate, this part
> of the code is known to be built by reports and human estimations, also the
> 9.2 version got heavy modifications in this area.
>
> Let the 'Hints' be inside an extension thus we are able to track them and fix
> the planner/costestimate issues.
>
> I don't see why PostgreSQL needs 'Hints' *in-core*.

Here here!  PostgreSQL is well known for its extensibility and this is
the perfect place for hints.  That way they can get worked on without
becoming a crutch for every user and forcing the backend developers to
support what may or may not be a good idea syntax wise.  After a few
different people have banged some code out to make workable hint
syntaxes for their own use maybe then it will be time to revisit
adding hints to core.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-23 Thread Gavin Flower

On 22/11/12 06:28, Craig James wrote:



On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway > wrote:


On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
> Rather than telling the planner what to do or not to do, I'd
much rather
> have hints that give the planner more information about the
tables and
> quals involved in the query. A typical source of bad plans is
when the
> planner gets its cost estimates wrong. So rather than telling the
> planner to use a nested loop join for "a INNER JOIN b ON a.id
 = b.id ",
> the user could tell the planner that there are only 10 rows that
match
> the "a.id  = b.id " qual. That gives
the planner the information it needs
> to choose the right plan on its own. That kind of hints would be
much
> less implementation specific and much more likely to still be
useful, or
> at least not outright counter-productive, in a future version with a
> smarter planner.
>
> You could also attach that kind of hints to tables and columns,
which
> would be more portable and nicer than decorating all queries.

I like this idea, but also think that if we have a syntax to allow
hints, it would be nice to have a simple way to ignore all hints
(yes, I
suppose I'm suggesting yet another GUC). That way after sprinkling
your
SQL with hints, you could easily periodically (e.g. after a Postgres
upgrade) test what would happen if the hints were removed.


Or a three-way choice: Allow, ignore, or generate an error. That would 
allow developers to identify where hints are being used.


Craig


Joe
--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support



Or perhaps hints should have the pg version attached, so that they are 
automatically ignored when the pg version changed?  Problem may then 
become people reluctant to upgrade because their hints relate to a 
previous version!  Sigh...


Even requiring registration of hints and expiring them after a limited 
time period would not work - as people would simply automate the process 
of registration & application...



Cheers,
Gavin


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-23 Thread Cédric Villemain
Le mercredi 21 novembre 2012 17:34:02, Craig James a écrit :
> On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner  wrote:
> > It's a tough problem. Disguising and not documenting the available
> > optimizer hints leads to more reports on where the optimizer should
> > be smarter, and has spurred optimizer improvements. ...
> > Regarding the above-mentioned benefits we would stand to lose by
> > having clear and documented hints, perhaps we could occasionally
> > solicit input on where people are finding hints useful to get ideas
> > on where we might want to improve the optimizer. As far as worrying
> > about people using hints to force a plan which is sub-optimal --
> > isn't that getting into nanny mode a bit too much?
> 
> Toward that end, the hint documentation (which is almost always viewed as
> HTML) could be prefaced by a strong suggestion to post performance
> questions in this group first, with links to the "subscribe" page and the
> "how to report performance problems" FAQ. The hint documentation could even
> be minimalistic; suggest to developers that they should post their
> problematic queries here before resorting to hints.  That would give the
> experts an opportunity to provide the normal advice.  The correct hint
> syntax would be suggested only when all other avenues failed.

We have hooks in PostgreSQL. We already have at least one extension which is 
using that to change the planner behavior.

We can have a bit more hooks and try to improve the cost estimate, this part 
of the code is known to be built by reports and human estimations, also the 
9.2 version got heavy modifications in this area. 

Let the 'Hints' be inside an extension thus we are able to track them and fix 
the planner/costestimate issues.

I don't see why PostgreSQL needs 'Hints' *in-core*.
-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation


signature.asc
Description: This is a digitally signed message part.


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-22 Thread Heikki Linnakangas

On 22.11.2012 02:53, Jeff Janes wrote:

That gives the planner the information it needs to choose the right plan on
its own. That kind of hints would be much less implementation specific and
much more likely to still be useful, or at least not outright
counter-productive, in a future version with a smarter planner.


When I run into unexpectedly poor performance, I have an intuitive
enough feel for my own data that I know what plan it ought to be
using.  Figuring out why it is not using it is very hard.  For one
thing, EXPLAIN tells you about the "winning" plan, but there is no
visibility into what ought to be the winning plan but isn't, so no way
to see why it isn't.So you first have to use our existing non-hint
hints (enable_*, doing weird things with cost_*, CTE stuff) to trick
it into using the plan I want it to use, before I can figure out why
it isn't using it, before I could figure out what hints of the style
you are suggesting to supply to get it to use it.


I'm sure that happens too, but my gut feeling is that more often the 
EXPLAIN ANALYZE output reveals a bad estimate somewhere in the plan, and 
the planner chooses a bad plan based on the bad estimate. If you hint 
the planner by giving a better estimate for where the estimator got it 
wrong, the planner will choose the desired plan.


- Heikki


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Jeff Janes
On Wed, Nov 21, 2012 at 8:05 AM, Heikki Linnakangas
 wrote:
> On 21.11.2012 15:42, Kevin Grittner wrote:
>>
>> Better, IMV, would be to identify what sorts of hints people actually
>> find useful, and use that as the basis for TODO items for optimizer
>> improvement as well as inventing clear ways to specify the desired
>> coercion. I liked the suggestion that a CTE which didn't need to be
>> materialized because of side-effects or multiple references have a
>> keyword. Personally, I think that AS MATERIALIZED x (SELECT ...)
>> would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to
>> specify that.
>
>
> Rather than telling the planner what to do or not to do, I'd much rather
> have hints that give the planner more information about the tables and quals
> involved in the query. A typical source of bad plans is when the planner
> gets its cost estimates wrong. So rather than telling the planner to use a
> nested loop join for "a INNER JOIN b ON a.id = b.id", the user could tell
> the planner that there are only 10 rows that match the "a.id = b.id" qual.

For each a.id there are 10 b.id, or for each b.id there are 10 a.id?

> That gives the planner the information it needs to choose the right plan on
> its own. That kind of hints would be much less implementation specific and
> much more likely to still be useful, or at least not outright
> counter-productive, in a future version with a smarter planner.

When I run into unexpectedly poor performance, I have an intuitive
enough feel for my own data that I know what plan it ought to be
using.  Figuring out why it is not using it is very hard.  For one
thing, EXPLAIN tells you about the "winning" plan, but there is no
visibility into what ought to be the winning plan but isn't, so no way
to see why it isn't.So you first have to use our existing non-hint
hints (enable_*, doing weird things with cost_*, CTE stuff) to trick
it into using the plan I want it to use, before I can figure out why
it isn't using it, before I could figure out what hints of the style
you are suggesting to supply to get it to use it.

So I think the type of hints you are suggesting would be about as hard
for the user to use as debugging the planner for the particular case
would be.  While the more traditional type of hint is easy to use,
because the end user understands their data more than they understand
the guts of the planner.


Cheers,

Jeff


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Andreas Kretschmer
Craig James  wrote:

> On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
> > Rather than telling the planner what to do or not to do, I'd much rather
> > have hints that give the planner more information about the tables and
> > quals involved in the query. A typical source of bad plans is when the
> > planner gets its cost estimates wrong. So rather than telling the
> > planner to use a nested loop join for "a INNER JOIN b ON a.id = b.id",
> > the user could tell the planner that there are only 10 rows that match
> > the "a.id = b.id" qual. That gives the planner the information it needs
> > to choose the right plan on its own. That kind of hints would be much
> > less implementation specific and much more likely to still be useful, or
> > at least not outright counter-productive, in a future version with a
> > smarter planner.
> >
> > You could also attach that kind of hints to tables and columns, which
> > would be more portable and nicer than decorating all queries.
> 
> I like this idea, but also think that if we have a syntax to allow
> hints, it would be nice to have a simple way to ignore all hints (yes, I
> suppose I'm suggesting yet another GUC). That way after sprinkling your
> SQL with hints, you could easily periodically (e.g. after a Postgres
> upgrade) test what would happen if the hints were removed.
> 
> 
> Or a three-way choice: Allow, ignore, or generate an error.  That would allow
> developers to identify where hints are being used.
> 
> Craig

+1

I think, we HAVE a smart planner, but hints in this direction are okay,
and we need a simple way to make such hints obsolete - for/in the future. 


Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Tom Lane
Craig James  writes:
> On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway  wrote:
>> I like this idea, but also think that if we have a syntax to allow
>> hints, it would be nice to have a simple way to ignore all hints (yes, I
>> suppose I'm suggesting yet another GUC). That way after sprinkling your
>> SQL with hints, you could easily periodically (e.g. after a Postgres
>> upgrade) test what would happen if the hints were removed.

> Or a three-way choice: Allow, ignore, or generate an error.  That would
> allow developers to identify where hints are being used.

Throwing errors would likely prevent you from reaching all parts of your
application, thus preventing complete testing.  Much more sensible to
just log such queries.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Joe Conway
On 11/21/2012 09:28 AM, Craig James wrote:
> 
> 
> On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway  > wrote:
> 
> On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
> > Rather than telling the planner what to do or not to do, I'd much
> rather
> > have hints that give the planner more information about the tables and
> > quals involved in the query. A typical source of bad plans is when the
> > planner gets its cost estimates wrong. So rather than telling the
> > planner to use a nested loop join for "a INNER JOIN b ON a.id
>  = b.id ",
> > the user could tell the planner that there are only 10 rows that match
> > the "a.id  = b.id " qual. That gives the
> planner the information it needs
> > to choose the right plan on its own. That kind of hints would be much
> > less implementation specific and much more likely to still be
> useful, or
> > at least not outright counter-productive, in a future version with a
> > smarter planner.
> >
> > You could also attach that kind of hints to tables and columns, which
> > would be more portable and nicer than decorating all queries.
> 
> I like this idea, but also think that if we have a syntax to allow
> hints, it would be nice to have a simple way to ignore all hints (yes, I
> suppose I'm suggesting yet another GUC). That way after sprinkling your
> SQL with hints, you could easily periodically (e.g. after a Postgres
> upgrade) test what would happen if the hints were removed.
> 
> 
> Or a three-way choice: Allow, ignore, or generate an error.  That would
> allow developers to identify where hints are being used.

+1

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Craig James
On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway  wrote:

> On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
> > Rather than telling the planner what to do or not to do, I'd much rather
> > have hints that give the planner more information about the tables and
> > quals involved in the query. A typical source of bad plans is when the
> > planner gets its cost estimates wrong. So rather than telling the
> > planner to use a nested loop join for "a INNER JOIN b ON a.id = b.id",
> > the user could tell the planner that there are only 10 rows that match
> > the "a.id = b.id" qual. That gives the planner the information it needs
> > to choose the right plan on its own. That kind of hints would be much
> > less implementation specific and much more likely to still be useful, or
> > at least not outright counter-productive, in a future version with a
> > smarter planner.
> >
> > You could also attach that kind of hints to tables and columns, which
> > would be more portable and nicer than decorating all queries.
>
> I like this idea, but also think that if we have a syntax to allow
> hints, it would be nice to have a simple way to ignore all hints (yes, I
> suppose I'm suggesting yet another GUC). That way after sprinkling your
> SQL with hints, you could easily periodically (e.g. after a Postgres
> upgrade) test what would happen if the hints were removed.
>

Or a three-way choice: Allow, ignore, or generate an error.  That would
allow developers to identify where hints are being used.

Craig


>
> Joe
> --
> Joe Conway
> credativ LLC: http://www.credativ.us
> Linux, PostgreSQL, and general Open Source
> Training, Service, Consulting, & 24x7 Support
>
>
>


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Joe Conway
On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
> Rather than telling the planner what to do or not to do, I'd much rather
> have hints that give the planner more information about the tables and
> quals involved in the query. A typical source of bad plans is when the
> planner gets its cost estimates wrong. So rather than telling the
> planner to use a nested loop join for "a INNER JOIN b ON a.id = b.id",
> the user could tell the planner that there are only 10 rows that match
> the "a.id = b.id" qual. That gives the planner the information it needs
> to choose the right plan on its own. That kind of hints would be much
> less implementation specific and much more likely to still be useful, or
> at least not outright counter-productive, in a future version with a
> smarter planner.
> 
> You could also attach that kind of hints to tables and columns, which
> would be more portable and nicer than decorating all queries.

I like this idea, but also think that if we have a syntax to allow
hints, it would be nice to have a simple way to ignore all hints (yes, I
suppose I'm suggesting yet another GUC). That way after sprinkling your
SQL with hints, you could easily periodically (e.g. after a Postgres
upgrade) test what would happen if the hints were removed.

Joe
-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Craig James
On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner  wrote:

>
> It's a tough problem. Disguising and not documenting the available
> optimizer hints leads to more reports on where the optimizer should
> be smarter, and has spurred optimizer improvements. ...
> Regarding the above-mentioned benefits we would stand to lose by
> having clear and documented hints, perhaps we could occasionally
> solicit input on where people are finding hints useful to get ideas
> on where we might want to improve the optimizer. As far as worrying
> about people using hints to force a plan which is sub-optimal --
> isn't that getting into nanny mode a bit too much?
>

Toward that end, the hint documentation (which is almost always viewed as
HTML) could be prefaced by a strong suggestion to post performance
questions in this group first, with links to the "subscribe" page and the
"how to report performance problems" FAQ. The hint documentation could even
be minimalistic; suggest to developers that they should post their
problematic queries here before resorting to hints.  That would give the
experts an opportunity to provide the normal advice.  The correct hint
syntax would be suggested only when all other avenues failed.

Craig James


>
> -Kevin
>


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Heikki Linnakangas

On 21.11.2012 15:42, Kevin Grittner wrote:

Better, IMV, would be to identify what sorts of hints people actually
find useful, and use that as the basis for TODO items for optimizer
improvement as well as inventing clear ways to specify the desired
coercion. I liked the suggestion that a CTE which didn't need to be
materialized because of side-effects or multiple references have a
keyword. Personally, I think that AS MATERIALIZED x (SELECT ...)
would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to
specify that.


Rather than telling the planner what to do or not to do, I'd much rather 
have hints that give the planner more information about the tables and 
quals involved in the query. A typical source of bad plans is when the 
planner gets its cost estimates wrong. So rather than telling the 
planner to use a nested loop join for "a INNER JOIN b ON a.id = b.id", 
the user could tell the planner that there are only 10 rows that match 
the "a.id = b.id" qual. That gives the planner the information it needs 
to choose the right plan on its own. That kind of hints would be much 
less implementation specific and much more likely to still be useful, or 
at least not outright counter-productive, in a future version with a 
smarter planner.


You could also attach that kind of hints to tables and columns, which 
would be more portable and nicer than decorating all queries.


- Heikki


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Kevin Grittner
Craig Ringer wrote:
> On 11/21/2012 09:35 AM, Craig James wrote:
>> Why not make an explicit hint syntax and document it? I've still
>> don't understand why "hint" is a dirty word in Postgres. There are
>> a half-dozen or so ways in common use to circumvent or correct
>> sub-optimal plans.
> 
> The reason usually given is that hints provide easy workarounds for
> planner and stats issues, so people don't report problems or fix
> the underlying problem.
> 
> Of course, if that's all there was to it, `OFFSET 0` would be made
> into an error or warning, or ignored and not fenced.
> 
> The reality is, as you say, that there's a need, because the
> planner can never be perfect - or rather, if it were nearly
> perfect, it'd take so long to read the stats and calculate plans
> that everything would be glacially slow anyway. The planner has to
> compromise, and so cases will always arise where it needs a little
> help.
> 
> I think it's time to admit that and get the syntax in place for
> CTEs so there's room to optimize them later, rather than cementing
> CTEs-as-fences in forever as a Pg quirk.

It's a tough problem. Disguising and not documenting the available
optimizer hints leads to more reports on where the optimizer should
be smarter, and has spurred optimizer improvements. And many type of
hints would undoubtedly cause people to force what they *think* would
be the best plan in many cases where they are wrong, or become wrong
as data scales up. But it does seem odd every time I hear people
saying that they don't want to eliminate some optimization fence
because "they find it useful" while simultaneously arguing that we
don't have or want hints.

Having a way to coerce the optimizer from the plan it would take with
straightforward coding *is* a hint, and one down-side of hiding the
hints inside syntax mostly supported for other reasons is that people
who don't know about these clever devices can't do reasonable
refactoring of queries for readability without risking performance
regressions. Another down-side is that perfectly reasonable queries
ported from other databases that use hint syntax for hints run afoul
of the secret hints when trying to run queries on PostgreSQL, and get
performance potentially orders of magnitude worse than they expect.

I'm not sure what the best answer is, but as long as we have hints,
but only through OFFSET 0 or CTE usage, that should be documented.
Better, IMV, would be to identify what sorts of hints people actually
find useful, and use that as the basis for TODO items for optimizer
improvement as well as inventing clear ways to specify the desired
coercion. I liked the suggestion that a CTE which didn't need to be
materialized because of side-effects or multiple references have a
keyword. Personally, I think that AS MATERIALIZED x (SELECT ...)
would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to
specify that.

Regarding the above-mentioned benefits we would stand to lose by
having clear and documented hints, perhaps we could occasionally
solicit input on where people are finding hints useful to get ideas
on where we might want to improve the optimizer. As far as worrying
about people using hints to force a plan which is sub-optimal --
isn't that getting into nanny mode a bit too much?

-Kevin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-21 Thread Shaun Thomas

On 11/20/2012 08:15 PM, Craig Ringer wrote:


I think it's time to admit that and get the syntax in place for CTEs so
there's room to optimize them later, rather than cementing
CTEs-as-fences in forever as a Pg quirk.


I know I'm just some schmo, but I'd vote for this. I'm certainly guilty 
of using OFFSET 0. Undocumented hints are still hints. As much as I 
think they're a bad idea by cementing a certain plan that may not get 
the benefits of future versions, non-intuitive side-effects by using 
overloaded syntax are worse.


I've been using CTEs as temp tables because I know that's how they work. 
But I'd be more than willing to modify my syntax one way or the other to 
adopt non-materialized CTEs, provided there's some way to get the 
current behavior.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hints (was Poor performance using CTE)

2012-11-20 Thread Craig Ringer
On 11/21/2012 09:35 AM, Craig James wrote:
> Why not make an explicit hint syntax and document it? I've still don't
> understand why "hint" is a dirty word in Postgres.  There are a
> half-dozen or so ways in common use to circumvent or correct
> sub-optimal plans.
>

The reason usually given is that hints provide easy workarounds for
planner and stats issues, so people don't report problems or fix the
underlying problem.

Of course, if that's all there was to it, `OFFSET 0` would be made into
an error or warning, or ignored and not fenced.

The reality is, as you say, that there's a need, because the planner can
never be perfect - or rather, if it were nearly perfect, it'd take so
long to read the stats and calculate plans that everything would be
glacially slow anyway. The planner has to compromise, and so cases will
always arise where it needs a little help.

I think it's time to admit that and get the syntax in place for CTEs so
there's room to optimize them later, rather than cementing
CTEs-as-fences in forever as a Pg quirk.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance