Re: [PERFORM] Hints (was Poor performance using CTE)
On 28/11/12 15:17, Craig Ringer wrote: On 27/11/2012 3:42 PM, Scott Marlowe wrote: Here here! PostgreSQL is well known for its extensibility and this is the perfect place for hints. I agree with the sentiment and your concerns. However, this doesn't solve the CTE problem. Some people are relying on the planner's inability to push conditions into / pull conditions out of CTEs, and otherwise re-arrange them. If support for optimising into eligible CTEs (ie CTE terms that contain only SELECT or VALUES and call no VOLATILE functions) then these applications will potentially encounter serious performance regressions. Should this feature never be added to Pg, making it different and incompatible with other DBs that implement CTE optimisation, just because some people are using it for a hacky hint like OFFSET 0? Should these applications just be broken by the update, with people told to add `OFFSET 0` or load some not-yet-existing hints module after reporting the performance issue to the list? I don't think either of those are acceptable. Sooner or later somebody's going to want to add CTE optimisation, and I don't think that you can't or great, we'll do it and break everything are acceptable responses to any proposed patch someone might come up with to add that. A GUC might be OK, as apps can always SET it before problem queries or not-yet-ported code. It'd probably reduce the rate at which people fixed their code considerably, though, going by past experience with standard_conforming_strings, etc, but it'd work. -- Craig Ringer I think it would be best to be something in the SQL for SELECT, as: 1. One is more likely to find it by looking up the documentation for SELECT 2. It could allow selective application within a SELECT: one could have several queries within the WITH clause: where all except one might benefit for optimisation, and the exception might cause problems I have suggested a couple possible syntax paterns, but there may well be better alternative syntaxes. Cheers, Gavin
Re: [PERFORM] Hints (was Poor performance using CTE)
On Fri, Nov 23, 2012 at 3:05 AM, Cédric Villemain ced...@2ndquadrant.com wrote: Le mercredi 21 novembre 2012 17:34:02, Craig James a écrit : On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner kgri...@mail.com wrote: It's a tough problem. Disguising and not documenting the available optimizer hints leads to more reports on where the optimizer should be smarter, and has spurred optimizer improvements. ... Regarding the above-mentioned benefits we would stand to lose by having clear and documented hints, perhaps we could occasionally solicit input on where people are finding hints useful to get ideas on where we might want to improve the optimizer. As far as worrying about people using hints to force a plan which is sub-optimal -- isn't that getting into nanny mode a bit too much? Toward that end, the hint documentation (which is almost always viewed as HTML) could be prefaced by a strong suggestion to post performance questions in this group first, with links to the subscribe page and the how to report performance problems FAQ. The hint documentation could even be minimalistic; suggest to developers that they should post their problematic queries here before resorting to hints. That would give the experts an opportunity to provide the normal advice. The correct hint syntax would be suggested only when all other avenues failed. We have hooks in PostgreSQL. We already have at least one extension which is using that to change the planner behavior. We can have a bit more hooks and try to improve the cost estimate, this part of the code is known to be built by reports and human estimations, also the 9.2 version got heavy modifications in this area. Let the 'Hints' be inside an extension thus we are able to track them and fix the planner/costestimate issues. I don't see why PostgreSQL needs 'Hints' *in-core*. Here here! PostgreSQL is well known for its extensibility and this is the perfect place for hints. That way they can get worked on without becoming a crutch for every user and forcing the backend developers to support what may or may not be a good idea syntax wise. After a few different people have banged some code out to make workable hint syntaxes for their own use maybe then it will be time to revisit adding hints to core. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
On 27/11/2012 3:42 PM, Scott Marlowe wrote: Here here! PostgreSQL is well known for its extensibility and this is the perfect place for hints. I agree with the sentiment and your concerns. However, this doesn't solve the CTE problem. Some people are relying on the planner's inability to push conditions into / pull conditions out of CTEs, and otherwise re-arrange them. If support for optimising into eligible CTEs (ie CTE terms that contain only SELECT or VALUES and call no VOLATILE functions) then these applications will potentially encounter serious performance regressions. Should this feature never be added to Pg, making it different and incompatible with other DBs that implement CTE optimisation, just because some people are using it for a hacky hint like OFFSET 0? Should these applications just be broken by the update, with people told to add `OFFSET 0` or load some not-yet-existing hints module after reporting the performance issue to the list? I don't think either of those are acceptable. Sooner or later somebody's going to want to add CTE optimisation, and I don't think that you can't or great, we'll do it and break everything are acceptable responses to any proposed patch someone might come up with to add that. A GUC might be OK, as apps can always SET it before problem queries or not-yet-ported code. It'd probably reduce the rate at which people fixed their code considerably, though, going by past experience with standard_conforming_strings, etc, but it'd work. -- Craig Ringer -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
On Tue, Nov 27, 2012 at 7:17 PM, Craig Ringer cr...@2ndquadrant.com wrote: On 27/11/2012 3:42 PM, Scott Marlowe wrote: Here here! PostgreSQL is well known for its extensibility and this is the perfect place for hints. I agree with the sentiment and your concerns. However, this doesn't solve the CTE problem. Some people are relying on the planner's inability to push conditions into / pull conditions out of CTEs, and otherwise re-arrange them. If support for optimising into eligible CTEs (ie CTE terms that contain only SELECT or VALUES and call no VOLATILE functions) then these applications will potentially encounter serious performance regressions. Should this feature never be added to Pg, making it different and incompatible with other DBs that implement CTE optimisation, just because some people are using it for a hacky hint like OFFSET 0? I'm strictly talking about any hinting mechanism being added being an extension. Fixing the planner so that optimizations can get cross the CTE boundary seems the domain of back end hackers, not extensions. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
Le mercredi 21 novembre 2012 17:34:02, Craig James a écrit : On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner kgri...@mail.com wrote: It's a tough problem. Disguising and not documenting the available optimizer hints leads to more reports on where the optimizer should be smarter, and has spurred optimizer improvements. ... Regarding the above-mentioned benefits we would stand to lose by having clear and documented hints, perhaps we could occasionally solicit input on where people are finding hints useful to get ideas on where we might want to improve the optimizer. As far as worrying about people using hints to force a plan which is sub-optimal -- isn't that getting into nanny mode a bit too much? Toward that end, the hint documentation (which is almost always viewed as HTML) could be prefaced by a strong suggestion to post performance questions in this group first, with links to the subscribe page and the how to report performance problems FAQ. The hint documentation could even be minimalistic; suggest to developers that they should post their problematic queries here before resorting to hints. That would give the experts an opportunity to provide the normal advice. The correct hint syntax would be suggested only when all other avenues failed. We have hooks in PostgreSQL. We already have at least one extension which is using that to change the planner behavior. We can have a bit more hooks and try to improve the cost estimate, this part of the code is known to be built by reports and human estimations, also the 9.2 version got heavy modifications in this area. Let the 'Hints' be inside an extension thus we are able to track them and fix the planner/costestimate issues. I don't see why PostgreSQL needs 'Hints' *in-core*. -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation signature.asc Description: This is a digitally signed message part.
Re: [PERFORM] Hints (was Poor performance using CTE)
On 22/11/12 06:28, Craig James wrote: On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway m...@joeconway.com mailto:m...@joeconway.com wrote: On 11/21/2012 08:05 AM, Heikki Linnakangas wrote: Rather than telling the planner what to do or not to do, I'd much rather have hints that give the planner more information about the tables and quals involved in the query. A typical source of bad plans is when the planner gets its cost estimates wrong. So rather than telling the planner to use a nested loop join for a INNER JOIN b ON a.id http://a.id = b.id http://b.id, the user could tell the planner that there are only 10 rows that match the a.id http://a.id = b.id http://b.id qual. That gives the planner the information it needs to choose the right plan on its own. That kind of hints would be much less implementation specific and much more likely to still be useful, or at least not outright counter-productive, in a future version with a smarter planner. You could also attach that kind of hints to tables and columns, which would be more portable and nicer than decorating all queries. I like this idea, but also think that if we have a syntax to allow hints, it would be nice to have a simple way to ignore all hints (yes, I suppose I'm suggesting yet another GUC). That way after sprinkling your SQL with hints, you could easily periodically (e.g. after a Postgres upgrade) test what would happen if the hints were removed. Or a three-way choice: Allow, ignore, or generate an error. That would allow developers to identify where hints are being used. Craig Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support Or perhaps hints should have the pg version attached, so that they are automatically ignored when the pg version changed? Problem may then become people reluctant to upgrade because their hints relate to a previous version! Sigh... Even requiring registration of hints and expiring them after a limited time period would not work - as people would simply automate the process of registration application... Cheers, Gavin
Re: [PERFORM] Hints (was Poor performance using CTE)
On 22.11.2012 02:53, Jeff Janes wrote: That gives the planner the information it needs to choose the right plan on its own. That kind of hints would be much less implementation specific and much more likely to still be useful, or at least not outright counter-productive, in a future version with a smarter planner. When I run into unexpectedly poor performance, I have an intuitive enough feel for my own data that I know what plan it ought to be using. Figuring out why it is not using it is very hard. For one thing, EXPLAIN tells you about the winning plan, but there is no visibility into what ought to be the winning plan but isn't, so no way to see why it isn't.So you first have to use our existing non-hint hints (enable_*, doing weird things with cost_*, CTE stuff) to trick it into using the plan I want it to use, before I can figure out why it isn't using it, before I could figure out what hints of the style you are suggesting to supply to get it to use it. I'm sure that happens too, but my gut feeling is that more often the EXPLAIN ANALYZE output reveals a bad estimate somewhere in the plan, and the planner chooses a bad plan based on the bad estimate. If you hint the planner by giving a better estimate for where the estimator got it wrong, the planner will choose the desired plan. - Heikki -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
On 11/20/2012 08:15 PM, Craig Ringer wrote: I think it's time to admit that and get the syntax in place for CTEs so there's room to optimize them later, rather than cementing CTEs-as-fences in forever as a Pg quirk. I know I'm just some schmo, but I'd vote for this. I'm certainly guilty of using OFFSET 0. Undocumented hints are still hints. As much as I think they're a bad idea by cementing a certain plan that may not get the benefits of future versions, non-intuitive side-effects by using overloaded syntax are worse. I've been using CTEs as temp tables because I know that's how they work. But I'd be more than willing to modify my syntax one way or the other to adopt non-materialized CTEs, provided there's some way to get the current behavior. -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604 312-444-8534 stho...@optionshouse.com __ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
Craig Ringer wrote: On 11/21/2012 09:35 AM, Craig James wrote: Why not make an explicit hint syntax and document it? I've still don't understand why hint is a dirty word in Postgres. There are a half-dozen or so ways in common use to circumvent or correct sub-optimal plans. The reason usually given is that hints provide easy workarounds for planner and stats issues, so people don't report problems or fix the underlying problem. Of course, if that's all there was to it, `OFFSET 0` would be made into an error or warning, or ignored and not fenced. The reality is, as you say, that there's a need, because the planner can never be perfect - or rather, if it were nearly perfect, it'd take so long to read the stats and calculate plans that everything would be glacially slow anyway. The planner has to compromise, and so cases will always arise where it needs a little help. I think it's time to admit that and get the syntax in place for CTEs so there's room to optimize them later, rather than cementing CTEs-as-fences in forever as a Pg quirk. It's a tough problem. Disguising and not documenting the available optimizer hints leads to more reports on where the optimizer should be smarter, and has spurred optimizer improvements. And many type of hints would undoubtedly cause people to force what they *think* would be the best plan in many cases where they are wrong, or become wrong as data scales up. But it does seem odd every time I hear people saying that they don't want to eliminate some optimization fence because they find it useful while simultaneously arguing that we don't have or want hints. Having a way to coerce the optimizer from the plan it would take with straightforward coding *is* a hint, and one down-side of hiding the hints inside syntax mostly supported for other reasons is that people who don't know about these clever devices can't do reasonable refactoring of queries for readability without risking performance regressions. Another down-side is that perfectly reasonable queries ported from other databases that use hint syntax for hints run afoul of the secret hints when trying to run queries on PostgreSQL, and get performance potentially orders of magnitude worse than they expect. I'm not sure what the best answer is, but as long as we have hints, but only through OFFSET 0 or CTE usage, that should be documented. Better, IMV, would be to identify what sorts of hints people actually find useful, and use that as the basis for TODO items for optimizer improvement as well as inventing clear ways to specify the desired coercion. I liked the suggestion that a CTE which didn't need to be materialized because of side-effects or multiple references have a keyword. Personally, I think that AS MATERIALIZED x (SELECT ...) would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to specify that. Regarding the above-mentioned benefits we would stand to lose by having clear and documented hints, perhaps we could occasionally solicit input on where people are finding hints useful to get ideas on where we might want to improve the optimizer. As far as worrying about people using hints to force a plan which is sub-optimal -- isn't that getting into nanny mode a bit too much? -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
On 21.11.2012 15:42, Kevin Grittner wrote: Better, IMV, would be to identify what sorts of hints people actually find useful, and use that as the basis for TODO items for optimizer improvement as well as inventing clear ways to specify the desired coercion. I liked the suggestion that a CTE which didn't need to be materialized because of side-effects or multiple references have a keyword. Personally, I think that AS MATERIALIZED x (SELECT ...) would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to specify that. Rather than telling the planner what to do or not to do, I'd much rather have hints that give the planner more information about the tables and quals involved in the query. A typical source of bad plans is when the planner gets its cost estimates wrong. So rather than telling the planner to use a nested loop join for a INNER JOIN b ON a.id = b.id, the user could tell the planner that there are only 10 rows that match the a.id = b.id qual. That gives the planner the information it needs to choose the right plan on its own. That kind of hints would be much less implementation specific and much more likely to still be useful, or at least not outright counter-productive, in a future version with a smarter planner. You could also attach that kind of hints to tables and columns, which would be more portable and nicer than decorating all queries. - Heikki -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner kgri...@mail.com wrote: It's a tough problem. Disguising and not documenting the available optimizer hints leads to more reports on where the optimizer should be smarter, and has spurred optimizer improvements. ... Regarding the above-mentioned benefits we would stand to lose by having clear and documented hints, perhaps we could occasionally solicit input on where people are finding hints useful to get ideas on where we might want to improve the optimizer. As far as worrying about people using hints to force a plan which is sub-optimal -- isn't that getting into nanny mode a bit too much? Toward that end, the hint documentation (which is almost always viewed as HTML) could be prefaced by a strong suggestion to post performance questions in this group first, with links to the subscribe page and the how to report performance problems FAQ. The hint documentation could even be minimalistic; suggest to developers that they should post their problematic queries here before resorting to hints. That would give the experts an opportunity to provide the normal advice. The correct hint syntax would be suggested only when all other avenues failed. Craig James -Kevin
Re: [PERFORM] Hints (was Poor performance using CTE)
On 11/21/2012 08:05 AM, Heikki Linnakangas wrote: Rather than telling the planner what to do or not to do, I'd much rather have hints that give the planner more information about the tables and quals involved in the query. A typical source of bad plans is when the planner gets its cost estimates wrong. So rather than telling the planner to use a nested loop join for a INNER JOIN b ON a.id = b.id, the user could tell the planner that there are only 10 rows that match the a.id = b.id qual. That gives the planner the information it needs to choose the right plan on its own. That kind of hints would be much less implementation specific and much more likely to still be useful, or at least not outright counter-productive, in a future version with a smarter planner. You could also attach that kind of hints to tables and columns, which would be more portable and nicer than decorating all queries. I like this idea, but also think that if we have a syntax to allow hints, it would be nice to have a simple way to ignore all hints (yes, I suppose I'm suggesting yet another GUC). That way after sprinkling your SQL with hints, you could easily periodically (e.g. after a Postgres upgrade) test what would happen if the hints were removed. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway m...@joeconway.com wrote: On 11/21/2012 08:05 AM, Heikki Linnakangas wrote: Rather than telling the planner what to do or not to do, I'd much rather have hints that give the planner more information about the tables and quals involved in the query. A typical source of bad plans is when the planner gets its cost estimates wrong. So rather than telling the planner to use a nested loop join for a INNER JOIN b ON a.id = b.id, the user could tell the planner that there are only 10 rows that match the a.id = b.id qual. That gives the planner the information it needs to choose the right plan on its own. That kind of hints would be much less implementation specific and much more likely to still be useful, or at least not outright counter-productive, in a future version with a smarter planner. You could also attach that kind of hints to tables and columns, which would be more portable and nicer than decorating all queries. I like this idea, but also think that if we have a syntax to allow hints, it would be nice to have a simple way to ignore all hints (yes, I suppose I'm suggesting yet another GUC). That way after sprinkling your SQL with hints, you could easily periodically (e.g. after a Postgres upgrade) test what would happen if the hints were removed. Or a three-way choice: Allow, ignore, or generate an error. That would allow developers to identify where hints are being used. Craig Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support
Re: [PERFORM] Hints (was Poor performance using CTE)
On 11/21/2012 09:28 AM, Craig James wrote: On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway m...@joeconway.com mailto:m...@joeconway.com wrote: On 11/21/2012 08:05 AM, Heikki Linnakangas wrote: Rather than telling the planner what to do or not to do, I'd much rather have hints that give the planner more information about the tables and quals involved in the query. A typical source of bad plans is when the planner gets its cost estimates wrong. So rather than telling the planner to use a nested loop join for a INNER JOIN b ON a.id http://a.id = b.id http://b.id, the user could tell the planner that there are only 10 rows that match the a.id http://a.id = b.id http://b.id qual. That gives the planner the information it needs to choose the right plan on its own. That kind of hints would be much less implementation specific and much more likely to still be useful, or at least not outright counter-productive, in a future version with a smarter planner. You could also attach that kind of hints to tables and columns, which would be more portable and nicer than decorating all queries. I like this idea, but also think that if we have a syntax to allow hints, it would be nice to have a simple way to ignore all hints (yes, I suppose I'm suggesting yet another GUC). That way after sprinkling your SQL with hints, you could easily periodically (e.g. after a Postgres upgrade) test what would happen if the hints were removed. Or a three-way choice: Allow, ignore, or generate an error. That would allow developers to identify where hints are being used. +1 Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
Craig James cja...@emolecules.com writes: On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway m...@joeconway.com wrote: I like this idea, but also think that if we have a syntax to allow hints, it would be nice to have a simple way to ignore all hints (yes, I suppose I'm suggesting yet another GUC). That way after sprinkling your SQL with hints, you could easily periodically (e.g. after a Postgres upgrade) test what would happen if the hints were removed. Or a three-way choice: Allow, ignore, or generate an error. That would allow developers to identify where hints are being used. Throwing errors would likely prevent you from reaching all parts of your application, thus preventing complete testing. Much more sensible to just log such queries. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
Craig James cja...@emolecules.com wrote: On 11/21/2012 08:05 AM, Heikki Linnakangas wrote: Rather than telling the planner what to do or not to do, I'd much rather have hints that give the planner more information about the tables and quals involved in the query. A typical source of bad plans is when the planner gets its cost estimates wrong. So rather than telling the planner to use a nested loop join for a INNER JOIN b ON a.id = b.id, the user could tell the planner that there are only 10 rows that match the a.id = b.id qual. That gives the planner the information it needs to choose the right plan on its own. That kind of hints would be much less implementation specific and much more likely to still be useful, or at least not outright counter-productive, in a future version with a smarter planner. You could also attach that kind of hints to tables and columns, which would be more portable and nicer than decorating all queries. I like this idea, but also think that if we have a syntax to allow hints, it would be nice to have a simple way to ignore all hints (yes, I suppose I'm suggesting yet another GUC). That way after sprinkling your SQL with hints, you could easily periodically (e.g. after a Postgres upgrade) test what would happen if the hints were removed. Or a three-way choice: Allow, ignore, or generate an error. That would allow developers to identify where hints are being used. Craig +1 I think, we HAVE a smart planner, but hints in this direction are okay, and we need a simple way to make such hints obsolete - for/in the future. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) If I was god, I would recompile penguin with --enable-fly. (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889° -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
On Wed, Nov 21, 2012 at 8:05 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 21.11.2012 15:42, Kevin Grittner wrote: Better, IMV, would be to identify what sorts of hints people actually find useful, and use that as the basis for TODO items for optimizer improvement as well as inventing clear ways to specify the desired coercion. I liked the suggestion that a CTE which didn't need to be materialized because of side-effects or multiple references have a keyword. Personally, I think that AS MATERIALIZED x (SELECT ...) would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to specify that. Rather than telling the planner what to do or not to do, I'd much rather have hints that give the planner more information about the tables and quals involved in the query. A typical source of bad plans is when the planner gets its cost estimates wrong. So rather than telling the planner to use a nested loop join for a INNER JOIN b ON a.id = b.id, the user could tell the planner that there are only 10 rows that match the a.id = b.id qual. For each a.id there are 10 b.id, or for each b.id there are 10 a.id? That gives the planner the information it needs to choose the right plan on its own. That kind of hints would be much less implementation specific and much more likely to still be useful, or at least not outright counter-productive, in a future version with a smarter planner. When I run into unexpectedly poor performance, I have an intuitive enough feel for my own data that I know what plan it ought to be using. Figuring out why it is not using it is very hard. For one thing, EXPLAIN tells you about the winning plan, but there is no visibility into what ought to be the winning plan but isn't, so no way to see why it isn't.So you first have to use our existing non-hint hints (enable_*, doing weird things with cost_*, CTE stuff) to trick it into using the plan I want it to use, before I can figure out why it isn't using it, before I could figure out what hints of the style you are suggesting to supply to get it to use it. So I think the type of hints you are suggesting would be about as hard for the user to use as debugging the planner for the particular case would be. While the more traditional type of hint is easy to use, because the end user understands their data more than they understand the guts of the planner. Cheers, Jeff -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Hints (was Poor performance using CTE)
On Tue, Nov 20, 2012 at 3:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: Craig Ringer cr...@2ndquadrant.com writes: On 11/21/2012 12:06 AM, Claudio Freire wrote: I meant for postgres to do automatically. Rewriting as a join wouldn't work as an optimization fence the way we're used to, but pushing constraints upwards can only help (especially if highly selective). Because people are now used to using CTEs as query hints, it'd probably cause performance regressions in working queries. Perhaps more importantly, Pg would have to prove that doing so didn't change queries that invoked functions with side-effects to avoid changing the results of currently valid queries. We could trivially arrange to keep the current semantics if the CTE query contains any volatile functions (or of course if it's INSERT/UPDATE/DELETE). I think we'd also need to not optimize if it's invoked from more than one place in the outer query. I think the more interesting question is what cases wouldn't be covered by such a rule. Typically you need to use OFFSET 0 in situations where the planner has guessed wrong about costs or rowcounts, and I think people are likely using WITH for that as well. Should we be telling people that they ought to insert OFFSET 0 in WITH queries if they want to be sure there's an optimization fence? I'm probably beating a dead horse ... but isn't this just a hint? Except that it's worse than a hint, because it's a hint in disguise and is undocumented. As far as I can tell, there's no use for OFFSET 0 except to act as an optimizer fence. It's clearly an important need, given the nature of the dialog above (and many others that have passed through this mailing list). Why not make an explicit hint syntax and document it? I've still don't understand why hint is a dirty word in Postgres. There are a half-dozen or so ways in common use to circumvent or correct sub-optimal plans. Craig James regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
On 11/21/2012 09:35 AM, Craig James wrote: Why not make an explicit hint syntax and document it? I've still don't understand why hint is a dirty word in Postgres. There are a half-dozen or so ways in common use to circumvent or correct sub-optimal plans. The reason usually given is that hints provide easy workarounds for planner and stats issues, so people don't report problems or fix the underlying problem. Of course, if that's all there was to it, `OFFSET 0` would be made into an error or warning, or ignored and not fenced. The reality is, as you say, that there's a need, because the planner can never be perfect - or rather, if it were nearly perfect, it'd take so long to read the stats and calculate plans that everything would be glacially slow anyway. The planner has to compromise, and so cases will always arise where it needs a little help. I think it's time to admit that and get the syntax in place for CTEs so there's room to optimize them later, rather than cementing CTEs-as-fences in forever as a Pg quirk. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance