Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2015-11-05 Thread Evgeniy Shishkin
Sorry for disrupting the thread, i am wondering will it be possible to use BRIN indexes to better estimate distribution? I mean create btree index and brin index, probe brin during planning and estimate if abort early plan with btree will be better. -- Sent via pgsql-performance mailing

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2015-02-12 Thread Michael Paquier
On Wed, Dec 17, 2014 at 4:55 PM, Simon Riggs si...@2ndquadrant.com wrote: On 12 December 2014 at 03:31, Simon Riggs si...@2ndquadrant.com wrote: Also attached is a new parameter called enable_sortedpath which can be used to turn on/off the sorted path generated by the planner. Now with

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-12-16 Thread Simon Riggs
On 12 December 2014 at 03:31, Simon Riggs si...@2ndquadrant.com wrote: Also attached is a new parameter called enable_sortedpath which can be used to turn on/off the sorted path generated by the planner. Now with attachment. (Thanks Jeff!) -- Simon Riggs

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-12-11 Thread Simon Riggs
On 30 September 2014 at 10:25, Simon Riggs si...@2ndquadrant.com wrote: On 30 September 2014 00:00, Tom Lane t...@sss.pgh.pa.us wrote: The existing cost estimation code effectively assumes that they're perfectly uniformly distributed; which is a good average-case assumption but can be

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-12-11 Thread Simon Riggs
On 12 December 2014 at 03:22, Simon Riggs si...@2ndquadrant.com wrote: It's a simple patch, but it solves the test cases I know about and does almost nothing to planning time. Test cases attached. The files marked pettus_* are written up from Christophe Pettus' blog. The other test case is one

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-12-09 Thread Josh Berkus
On 12/05/2014 08:04 AM, Simon Riggs wrote: On 6 December 2014 at 00:45, Merlin Moncure mmonc...@gmail.com wrote: Neat -- got any test cases (would this have prevented OP's problem)? No test case was posted, so I am unable to confirm. A test case I produced that appears to be the same

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-12-09 Thread Simon Riggs
On 10 December 2014 at 10:46, Josh Berkus j...@agliodbs.com wrote: On 12/05/2014 08:04 AM, Simon Riggs wrote: On 6 December 2014 at 00:45, Merlin Moncure mmonc...@gmail.com wrote: Neat -- got any test cases (would this have prevented OP's problem)? No test case was posted, so I am unable to

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-12-05 Thread Merlin Moncure
On Fri, Dec 5, 2014 at 12:46 AM, Simon Riggs si...@2ndquadrant.com wrote: On 30 September 2014 at 05:53, Simon Riggs si...@2ndquadrant.com wrote: On 29 September 2014 16:00, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Sep 26, 2014 at 3:06 AM, Simon Riggs si...@2ndquadrant.com wrote: The

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-12-05 Thread Simon Riggs
On 6 December 2014 at 00:45, Merlin Moncure mmonc...@gmail.com wrote: Neat -- got any test cases (would this have prevented OP's problem)? No test case was posted, so I am unable to confirm. A test case I produced that appears to be the same issue is fixed. I await confirmation from the OP.

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-12-04 Thread Simon Riggs
On 30 September 2014 at 05:53, Simon Riggs si...@2ndquadrant.com wrote: On 29 September 2014 16:00, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Sep 26, 2014 at 3:06 AM, Simon Riggs si...@2ndquadrant.com wrote: The problem, as I see it, is different. We assume that if there are 100

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-11-23 Thread Tomas Vondra
On 21.11.2014 19:38, Jeff Janes wrote: When I run this patch on the regression database, I get a case where the current method is exact but the adaptive one is off: WARNING: ndistinct estimate current=676.00 adaptive=906.00 select count(distinct stringu1) from onek; 676 It should be

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-11-21 Thread Jeff Janes
On Fri, Oct 10, 2014 at 10:53 AM, Tomas Vondra t...@fuzzy.cz wrote: On 10.10.2014 14:10, Tomas Vondra wrote: Dne 10 Říjen 2014, 13:16, Greg Stark napsal(a): On Thu, Oct 2, 2014 at 8:56 PM, Josh Berkus j...@agliodbs.com wrote: Yes, it's only intractable if you're wedded to the idea of a

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-19 Thread Greg Stark
On Sat, Oct 18, 2014 at 6:01 PM, Tomas Vondra t...@fuzzy.cz wrote: Hmmm. I have 0 experience with handling patents and related issues. Any idea how to address that? Well there's no real way to address it. But to summarize: 1) We should not go searching for patents, knowing that something is

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-18 Thread Tomas Vondra
On 17.10.2014 19:25, Greg Stark wrote: On Wed, Oct 15, 2014 at 7:02 PM, Tomas Vondra t...@fuzzy.cz wrote: If you know the title of the article, it's usually available elsewhere on the web - either at the university site, or elsewhere. I found these two articles about block-based sampling:

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-17 Thread Greg Stark
On Wed, Oct 15, 2014 at 7:02 PM, Tomas Vondra t...@fuzzy.cz wrote: If you know the title of the article, it's usually available elsewhere on the web - either at the university site, or elsewhere. I found these two articles about block-based sampling:

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-15 Thread Josh Berkus
On 10/10/2014 04:16 AM, Greg Stark wrote: On Thu, Oct 2, 2014 at 8:56 PM, Josh Berkus j...@agliodbs.com wrote: Yes, it's only intractable if you're wedded to the idea of a tiny, fixed-size sample. If we're allowed to sample, say, 1% of the table, we can get a MUCH more accurate n_distinct

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-15 Thread Tomas Vondra
On 15.10.2014 19:20, Josh Berkus wrote: On 10/10/2014 04:16 AM, Greg Stark wrote: On Thu, Oct 2, 2014 at 8:56 PM, Josh Berkus j...@agliodbs.com wrote: Yes, it's only intractable if you're wedded to the idea of a tiny, fixed-size sample. If we're allowed to sample, say, 1% of the table, we

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-10 Thread Greg Stark
On Thu, Oct 2, 2014 at 8:56 PM, Josh Berkus j...@agliodbs.com wrote: Yes, it's only intractable if you're wedded to the idea of a tiny, fixed-size sample. If we're allowed to sample, say, 1% of the table, we can get a MUCH more accurate n_distinct estimate using multiple algorithms, of which

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-10 Thread Tomas Vondra
Dne 10 Říjen 2014, 13:16, Greg Stark napsal(a): On Thu, Oct 2, 2014 at 8:56 PM, Josh Berkus j...@agliodbs.com wrote: Yes, it's only intractable if you're wedded to the idea of a tiny, fixed-size sample. If we're allowed to sample, say, 1% of the table, we can get a MUCH more accurate

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-10 Thread Craig James
On Fri, Oct 10, 2014 at 5:10 AM, Tomas Vondra t...@fuzzy.cz wrote: I've gone looking for papers on this topic but from what I read this isn't so. To get any noticeable improvement you need to read 10-50% of the table and that's effectively the same as reading the entire table -- and it

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-10 Thread Tomas Vondra
On 10.10.2014 16:21, Craig James wrote: On Fri, Oct 10, 2014 at 5:10 AM, Tomas Vondra t...@fuzzy.cz mailto:t...@fuzzy.cz wrote: I've gone looking for papers on this topic but from what I read this isn't so. To get any noticeable improvement you need to read 10-50% of the

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-10 Thread Tomas Vondra
On 10.10.2014 14:10, Tomas Vondra wrote: Dne 10 Říjen 2014, 13:16, Greg Stark napsal(a): On Thu, Oct 2, 2014 at 8:56 PM, Josh Berkus j...@agliodbs.com wrote: Yes, it's only intractable if you're wedded to the idea of a tiny, fixed-size sample. If we're allowed to sample, say, 1% of the table,

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-10 Thread Craig James
On Fri, Oct 10, 2014 at 9:53 AM, Tomas Vondra t...@fuzzy.cz wrote: On 10.10.2014 16:21, Craig James wrote: Our index is for chemical structures. Chemicals are indexed on chemical fragments http://emolecules.com/info/molecular-informatics. A search typically starts with 50-200 indexed

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-10 Thread Tomas Vondra
On 10.10.2014 19:59, Craig James wrote: On Fri, Oct 10, 2014 at 9:53 AM, Tomas Vondra t...@fuzzy.cz mailto:t...@fuzzy.cz wrote: On 10.10.2014 16:21, Craig James wrote: Our index is for chemical structures. Chemicals are indexed on chemical fragments

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-03 Thread Jeff Janes
On Thu, Oct 2, 2014 at 12:56 PM, Josh Berkus j...@agliodbs.com wrote: On 10/02/2014 02:30 AM, Peter Geoghegan wrote: On Thu, Oct 2, 2014 at 1:19 AM, Simon Riggs si...@2ndquadrant.com wrote: Having read papers on it, I believe the problem is intractable. Coding is not the issue. To anyone:

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-03 Thread Tomas Vondra
On 3.10.2014 21:58, Jeff Janes wrote: On Thu, Oct 2, 2014 at 12:56 PM, Josh Berkus j...@agliodbs.com mailto:j...@agliodbs.com wrote: Yes, it's only intractable if you're wedded to the idea of a tiny, fixed-size sample. If we're allowed to sample, say, 1% of the table, we can

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-03 Thread Tomas Vondra
On 3.10.2014 02:54, Peter Geoghegan wrote: On Thu, Oct 2, 2014 at 12:56 PM, Josh Berkus j...@agliodbs.com wrote: Yes, it's only intractable if you're wedded to the idea of a tiny, fixed-size sample. If we're allowed to sample, say, 1% of the table, we can get a MUCH more accurate n_distinct

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-02 Thread Simon Riggs
On 1 October 2014 19:56, Josh Berkus j...@agliodbs.com wrote: On 09/30/2014 04:01 PM, Simon Riggs wrote: On 30 September 2014 18:28, Jeff Janes jeff.ja...@gmail.com wrote: Anyway, in the particular case I posted fixing n_distinct to realistic numbers (%) fixed the query plan. But wouldn't

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-02 Thread Peter Geoghegan
On Thu, Oct 2, 2014 at 1:19 AM, Simon Riggs si...@2ndquadrant.com wrote: I disagree that (1) is not worth fixing just because we've provided users with an API to override the stats. It would unquestionably be better for us to have a better n_distinct estimate in the first place. Further, this

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-02 Thread Ryan Johnson
On 29/09/2014 9:00 AM, Merlin Moncure wrote: On Fri, Sep 26, 2014 at 3:06 AM, Simon Riggs si...@2ndquadrant.com wrote: The problem, as I see it, is different. We assume that if there are 100 distinct values and you use LIMIT 1 that you would only need to scan 1% of rows. We assume that the data

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-02 Thread Josh Berkus
On 10/02/2014 02:30 AM, Peter Geoghegan wrote: On Thu, Oct 2, 2014 at 1:19 AM, Simon Riggs si...@2ndquadrant.com wrote: Having read papers on it, I believe the problem is intractable. Coding is not the issue. To anyone: please prove me wrong, in detail, with references so it can be coded. I

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-02 Thread Peter Geoghegan
On Thu, Oct 2, 2014 at 12:56 PM, Josh Berkus j...@agliodbs.com wrote: Yes, it's only intractable if you're wedded to the idea of a tiny, fixed-size sample. If we're allowed to sample, say, 1% of the table, we can get a MUCH more accurate n_distinct estimate using multiple algorithms, of which

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-01 Thread Josh Berkus
On 09/30/2014 04:01 PM, Simon Riggs wrote: On 30 September 2014 18:28, Jeff Janes jeff.ja...@gmail.com wrote: Anyway, in the particular case I posted fixing n_distinct to realistic numbers (%) fixed the query plan. But wouldn't fixing the absolute number also have fixed the plan? There

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-30 Thread Simon Riggs
On 29 September 2014 22:54, Josh Berkus j...@agliodbs.com wrote: On 09/26/2014 01:06 AM, Simon Riggs wrote: On 23 September 2014 00:56, Josh Berkus j...@agliodbs.com wrote: We've hashed that out a bit, but frankly I think it's much more profitable to pursue fixing the actual problem than

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-30 Thread Simon Riggs
On 30 September 2014 00:00, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: The way I'm seeing it, you can't assume the LIMIT will apply to any IndexScan that doesn't have an index condition. If it has just a filter, or nothing at all, just an ordering then it

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-30 Thread Graeme B. Bell
The existing cost estimation code effectively assumes that they're perfectly uniformly distributed; which is a good average-case assumption but can be horribly wrong in the worst case. Sorry, just an outsider jumping in with a quick comment. Every year or two the core count goes up.

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-30 Thread Claudio Freire
On Tue, Sep 30, 2014 at 8:34 AM, Graeme B. Bell g...@skogoglandskap.no wrote: The existing cost estimation code effectively assumes that they're perfectly uniformly distributed; which is a good average-case assumption but can be horribly wrong in the worst case. Sorry, just an outsider

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-30 Thread Tom Lane
Graeme B. Bell g...@skogoglandskap.no writes: Every year or two the core count goes up. Can/should/does postgres ever attempt two strategies in parallel, in cases where strategy A is generally good but strategy B prevents bad worst case behaviour? Kind of like a Schrödinger's Cat approach

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-30 Thread Jeff Janes
On Mon, Sep 29, 2014 at 7:12 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: Would it be feasible to get a competent statistician to advise what data to collect, and to analyze it? Maybe it is possible to get a better estimate on how much of a table needs to be scanned, based on

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-30 Thread Jeff Janes
On Mon, Sep 29, 2014 at 2:54 PM, Josh Berkus j...@agliodbs.com wrote: On 09/26/2014 01:06 AM, Simon Riggs wrote: On 23 September 2014 00:56, Josh Berkus j...@agliodbs.com wrote: We've hashed that out a bit, but frankly I think it's much more profitable to pursue fixing the actual problem

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-30 Thread Graeme B. Bell
Thanks for your replies everyone. You can't run two plans and have them both returning rows to the client, That wasn't what I had in mind. I can envisage cases where the worst case behaviour of one plan results in zero rows by the time the alternative plan has generated the complete

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-30 Thread Gavin Flower
On 01/10/14 05:54, Jeff Janes wrote: On Mon, Sep 29, 2014 at 7:12 PM, Gavin Flower gavinflo...@archidevsys.co.nz mailto:gavinflo...@archidevsys.co.nz wrote: Would it be feasible to get a competent statistician to advise what data to collect, and to analyze it? Maybe it is possible to

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-30 Thread Merlin Moncure
On Tue, Sep 30, 2014 at 11:54 AM, Jeff Janes jeff.ja...@gmail.com wrote: On Mon, Sep 29, 2014 at 7:12 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: Would it be feasible to get a competent statistician to advise what data to collect, and to analyze it? Maybe it is possible to get a

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-30 Thread Simon Riggs
On 30 September 2014 18:28, Jeff Janes jeff.ja...@gmail.com wrote: Anyway, in the particular case I posted fixing n_distinct to realistic numbers (%) fixed the query plan. But wouldn't fixing the absolute number also have fixed the plan? There are two causes of this issue. 1. Poor

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-29 Thread Merlin Moncure
On Fri, Sep 26, 2014 at 3:06 AM, Simon Riggs si...@2ndquadrant.com wrote: The problem, as I see it, is different. We assume that if there are 100 distinct values and you use LIMIT 1 that you would only need to scan 1% of rows. We assume that the data is arranged in the table in a very

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-29 Thread Simon Riggs
On 29 September 2014 16:00, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Sep 26, 2014 at 3:06 AM, Simon Riggs si...@2ndquadrant.com wrote: The problem, as I see it, is different. We assume that if there are 100 distinct values and you use LIMIT 1 that you would only need to scan 1% of

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-29 Thread Josh Berkus
On 09/26/2014 01:06 AM, Simon Riggs wrote: On 23 September 2014 00:56, Josh Berkus j...@agliodbs.com wrote: We've hashed that out a bit, but frankly I think it's much more profitable to pursue fixing the actual problem than providing a workaround like risk, such as: a) fixing n_distinct

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-29 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: The way I'm seeing it, you can't assume the LIMIT will apply to any IndexScan that doesn't have an index condition. If it has just a filter, or nothing at all, just an ordering then it could easily scan the whole index if the stats are wrong. That

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-29 Thread Gavin Flower
On 30/09/14 12:00, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: The way I'm seeing it, you can't assume the LIMIT will apply to any IndexScan that doesn't have an index condition. If it has just a filter, or nothing at all, just an ordering then it could easily scan the whole index

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-29 Thread Greg Stark
On Fri, Sep 26, 2014 at 9:06 AM, Simon Riggs si...@2ndquadrant.com wrote: If we can at least agree it is a problem, we can try to move forwards. Well that's a good question. I don't think we do and I think the reason why is because we haven't actually pinned down exactly what is the problem.

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-26 Thread Simon Riggs
On 23 September 2014 00:56, Josh Berkus j...@agliodbs.com wrote: We've hashed that out a bit, but frankly I think it's much more profitable to pursue fixing the actual problem than providing a workaround like risk, such as: a) fixing n_distinct estimation b) estimating stacked quals using

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-22 Thread Merlin Moncure
On Sat, Sep 20, 2014 at 1:33 PM, Josh Berkus j...@agliodbs.com wrote: For example, we could increase the estimated cost for an abort-early index scan by 10X, to reflect our weak confidence in its correctness. Has any progress been made on the performance farm? The problem with suggestions

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-22 Thread Josh Berkus
On 09/22/2014 06:55 AM, Merlin Moncure wrote: Has any progress been made on the performance farm? The problem with suggestions like this (which seem pretty reasonable to me) is that we've got no way of quantifying the downside. Yeah, that's certainly an issue. The problem is that we'd need a

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-20 Thread Greg Stark
On 19 Sep 2014 19:40, Josh Berkus j...@agliodbs.com wrote: On 09/19/2014 10:15 AM, Merlin Moncure wrote: On Wed, Sep 17, 2014 at 7:11 PM, Josh Berkus j...@agliodbs.com wrote: This is the core issue with abort-early plans; they depend on our statistics being extremely accurate, which we

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-20 Thread Claudio Freire
On Sat, Sep 20, 2014 at 3:38 AM, Greg Stark st...@mit.edu wrote: Is there a canonical case of where 'abort early' plans help? (I'm new to that term -- is it a recent planner innovation...got any handy links?) Yeah, here's an example of the canonical case: Table t1 ( a, b, c ) - b is

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-20 Thread Tom Lane
Greg Stark st...@mit.edu writes: On 19 Sep 2014 19:40, Josh Berkus j...@agliodbs.com wrote: Yeah, here's an example of the canonical case: Table t1 ( a, b, c ) - b is low-cardinality - c is high-cardinality - There are separate indexes on both b and c. SELECT a, b, c FROM t1 WHERE b

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-20 Thread Josh Berkus
On 09/19/2014 11:38 PM, Greg Stark wrote: On 19 Sep 2014 19:40, Josh Berkus j...@agliodbs.com mailto:j...@agliodbs.com wrote: On 09/19/2014 10:15 AM, Merlin Moncure wrote: On Wed, Sep 17, 2014 at 7:11 PM, Josh Berkus j...@agliodbs.com mailto:j...@agliodbs.com wrote: This is the core

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-09-19 Thread Josh Berkus
On 09/19/2014 10:15 AM, Merlin Moncure wrote: On Wed, Sep 17, 2014 at 7:11 PM, Josh Berkus j...@agliodbs.com wrote: This is the core issue with abort-early plans; they depend on our statistics being extremely accurate, which we know they are not. And if they're wrong, the execution time climbs