Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-04 Thread Shulgin, Oleksandr
On Apr 5, 2016 00:31, "Tom Lane" wrote: > > Alex Shulgin writes: > > On Mon, Apr 4, 2016 at 1:06 AM, Tom Lane wrote: > >> I'm inclined to > >> revert the aspect of 3d3bf62f3 that made us work from "d" (the observed > >> number of

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-04 Thread Tom Lane
Alex Shulgin writes: > On Mon, Apr 4, 2016 at 1:06 AM, Tom Lane wrote: >> I'm inclined to >> revert the aspect of 3d3bf62f3 that made us work from "d" (the observed >> number of distinct values in the sample) rather than stadistinct (the >>

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-03 Thread Alex Shulgin
On Mon, Apr 4, 2016 at 1:06 AM, Tom Lane wrote: > Alex Shulgin writes: > > On Sun, Apr 3, 2016 at 10:53 PM, Tom Lane wrote: > >> The reason for checking toowide_cnt is that if it's greater than zero, > >> then in fact the track

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-03 Thread Tom Lane
Alex Shulgin writes: > On Sun, Apr 3, 2016 at 10:53 PM, Tom Lane wrote: >> The reason for checking toowide_cnt is that if it's greater than zero, >> then in fact the track list does NOT include all values seen, and it's >> flat-out wrong to claim that

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-03 Thread Alex Shulgin
On Sun, Apr 3, 2016 at 10:53 PM, Tom Lane wrote: > Alex Shulgin writes: > > This recalled observation can now also explain to me why in the > regression > > you've seen, the short path was not followed: my bet is that stadistinct > > appeared

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-03 Thread Tom Lane
Alex Shulgin writes: > This recalled observation can now also explain to me why in the regression > you've seen, the short path was not followed: my bet is that stadistinct > appeared negative. Yes, I think that's right. The table under consideration had just a few live

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-03 Thread Alex Shulgin
On Sun, Apr 3, 2016 at 8:24 AM, Alex Shulgin wrote: > > On Sun, Apr 3, 2016 at 7:49 AM, Tom Lane wrote: >> >> Alex Shulgin writes: >> > On Sun, Apr 3, 2016 at 7:18 AM, Tom Lane wrote: >> >> Well, we have to

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-03 Thread Alex Shulgin
On Sun, Apr 3, 2016, 18:40 Tom Lane wrote: > Alex Shulgin writes: > > > Well, if it's the only value it will be accepted simply because we are > > checking that special case already and don't even bother to loop through > > the track list. > > That

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-03 Thread Tom Lane
Alex Shulgin writes: > On Sun, Apr 3, 2016 at 7:49 AM, Tom Lane wrote: >> If there is only one value, it will have 100% of the samples, so it would >> get included under just about any decision rule (other than "more than >> 100% of this value plus

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-03 Thread Alex Shulgin
On Sun, Apr 3, 2016 at 7:49 AM, Tom Lane wrote: > Alex Shulgin writes: > > On Sun, Apr 3, 2016 at 7:18 AM, Tom Lane wrote: > >> Well, we have to do *something* with the last (possibly only) value. > >> Neither "include always" nor

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Tom Lane
Alex Shulgin writes: > On Sun, Apr 3, 2016 at 7:18 AM, Tom Lane wrote: >> Well, we have to do *something* with the last (possibly only) value. >> Neither "include always" nor "omit always" seem sane to me. What other >> decision rule do you want

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Alex Shulgin
On Sun, Apr 3, 2016 at 7:18 AM, Tom Lane wrote: > Alex Shulgin writes: > > On Sun, Apr 3, 2016 at 3:43 AM, Alex Shulgin > wrote: > >> I'm not sure yet about the 1% rule for the last value, but would also > love > >> to see if

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Tom Lane
Alex Shulgin writes: > On Sun, Apr 3, 2016 at 3:43 AM, Alex Shulgin wrote: >> I'm not sure yet about the 1% rule for the last value, but would also love >> to see if we can avoid the arbitrary limit here. What happens with a last >> value which is

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Alex Shulgin
On Sun, Apr 3, 2016 at 3:43 AM, Alex Shulgin wrote: > > I'm not sure yet about the 1% rule for the last value, but would also love > to see if we can avoid the arbitrary limit here. What happens with a last > value which is less than 1% popular in the current code

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Alex Shulgin
On Sat, Apr 2, 2016 at 8:57 PM, Shulgin, Oleksandr < oleksandr.shul...@zalando.de> wrote: > On Apr 2, 2016 18:38, "Tom Lane" wrote: > >> I did not like the fact that the compute_scalar_stats logic >> would allow absolutely anything into the MCV list once num_hist falls >>

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Shulgin, Oleksandr
On Apr 2, 2016 18:38, "Tom Lane" wrote: > > "Shulgin, Oleksandr" writes: > > On Apr 1, 2016 23:14, "Tom Lane" wrote: > >> Haven't looked at 0002 yet. > > > [crosses fingers] hope you'll have a chance to do that before feature

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Tom Lane
"Shulgin, Oleksandr" writes: > On Apr 1, 2016 23:14, "Tom Lane" wrote: >> Haven't looked at 0002 yet. > [crosses fingers] hope you'll have a chance to do that before feature > freeze for 9.6 I studied this patch for awhile after rebasing it

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-01 Thread Shulgin, Oleksandr
On Apr 1, 2016 23:14, "Tom Lane" wrote: > > "Shulgin, Oleksandr" writes: > > Alright. I'm attaching the latest version of this patch split in two > > parts: the first one is NULLs-related bugfix and the second is the > > "improvement" part,

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-01 Thread Tom Lane
"Shulgin, Oleksandr" writes: > Alright. I'm attaching the latest version of this patch split in two > parts: the first one is NULLs-related bugfix and the second is the > "improvement" part, which applies on top of the first one. I've applied the first of these

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-29 Thread Shulgin, Oleksandr
On Tue, Mar 29, 2016 at 6:24 PM, Tom Lane wrote: > "Shulgin, Oleksandr" writes: > > I've just seen that this patch doesn't have a reviewer assigned > anymore... > > I took my name off it because I was busy with other things and didn't > want to

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-29 Thread Tom Lane
"Shulgin, Oleksandr" writes: > I've just seen that this patch doesn't have a reviewer assigned anymore... I took my name off it because I was busy with other things and didn't want to discourage other people from reviewing it meanwhile. I do hope to get to it

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-29 Thread Shulgin, Oleksandr
On Tue, Mar 15, 2016 at 4:47 PM, Shulgin, Oleksandr < oleksandr.shul...@zalando.de> wrote: > On Wed, Mar 9, 2016 at 5:28 PM, Tom Lane wrote: > >> "Shulgin, Oleksandr" writes: >> > Yes, I now recall that my actual concern was that sample_cnt may

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-15 Thread Shulgin, Oleksandr
On Wed, Mar 9, 2016 at 5:28 PM, Tom Lane wrote: > "Shulgin, Oleksandr" writes: > > Yes, I now recall that my actual concern was that sample_cnt may > calculate > > to 0 due to the latest condition above, but that also implies track_cnt > == > >

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Tom Lane
"Shulgin, Oleksandr" writes: > Yes, I now recall that my actual concern was that sample_cnt may calculate > to 0 due to the latest condition above, but that also implies track_cnt == > 0, and then we have a for loop there which will not run at all due to this, > so I

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Shulgin, Oleksandr
On Wed, Mar 9, 2016 at 1:33 PM, Tomas Vondra wrote: > Hi, > > On Wed, 2016-03-09 at 11:23 +0100, Shulgin, Oleksandr wrote: > > On Tue, Mar 8, 2016 at 8:16 PM, Alvaro Herrera > > wrote: > > > > Also, I can't quite figure out why the

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Tom Lane
Tomas Vondra writes: > On Wed, 2016-03-09 at 12:02 -0300, Alvaro Herrera wrote: >> Tomas Vondra wrote: >>> FWIW while looking at the code I noticed that we skip wide varlena >>> values but not cstrings. Seems a bit suspicious. >> Uh, can you actually have columns of

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Tomas Vondra
On Wed, 2016-03-09 at 12:02 -0300, Alvaro Herrera wrote: > Tomas Vondra wrote: > > > FWIW while looking at the code I noticed that we skip wide varlena > > values but not cstrings. Seems a bit suspicious. > > Uh, can you actually have columns of cstring type? I don't think you > can ... Yeah,

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Alvaro Herrera
Tomas Vondra wrote: > FWIW while looking at the code I noticed that we skip wide varlena > values but not cstrings. Seems a bit suspicious. Uh, can you actually have columns of cstring type? I don't think you can ...

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Tomas Vondra
Hi, On Wed, 2016-03-09 at 11:23 +0100, Shulgin, Oleksandr wrote: > On Tue, Mar 8, 2016 at 8:16 PM, Alvaro Herrera > wrote: > Shulgin, Oleksandr wrote: > > > Alright. I'm attaching the latest version of this patch > split in two >

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Tomas Vondra
Hi, On Wed, 2016-03-09 at 10:58 +0100, Shulgin, Oleksandr wrote: > On Tue, Mar 8, 2016 at 9:10 PM, Joel Jacobson > wrote: > On Wed, Mar 9, 2016 at 1:25 AM, Shulgin, Oleksandr > wrote: > > Thank you for spending your time to

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Shulgin, Oleksandr
On Tue, Mar 8, 2016 at 8:16 PM, Alvaro Herrera wrote: > Shulgin, Oleksandr wrote: > > > Alright. I'm attaching the latest version of this patch split in two > > parts: the first one is NULLs-related bugfix and the second is the > > "improvement" part, which applies on

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Shulgin, Oleksandr
On Tue, Mar 8, 2016 at 9:10 PM, Joel Jacobson wrote: > On Wed, Mar 9, 2016 at 1:25 AM, Shulgin, Oleksandr > wrote: > > Thank you for spending your time to run these :-) > > n/p, it took like 30 seconds :-) > Great! I'm glad to hear it was as

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-08 Thread Tom Lane
Robert Haas writes: > On Wed, Jan 20, 2016 at 5:09 PM, Tom Lane wrote: >> Um, I would like to review it, but I doubt I'll find time before the end >> of the month. > Tom, can you pick this up? Yes, now that I've gotten out from under the pathification

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-08 Thread Alvaro Herrera
Shulgin, Oleksandr wrote: > Alright. I'm attaching the latest version of this patch split in two > parts: the first one is NULLs-related bugfix and the second is the > "improvement" part, which applies on top of the first one. I went over patch 0001 and it seems pretty reasonable. It's missing

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-08 Thread Joel Jacobson
On Wed, Mar 9, 2016 at 1:25 AM, Shulgin, Oleksandr wrote: > Thank you for spending your time to run these :-) n/p, it took like 30 seconds :-) > I don't want to be asking for too much here, but is there a chance you could > try the effects of the proposed patch on

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-08 Thread Robert Haas
On Wed, Jan 20, 2016 at 5:09 PM, Tom Lane wrote: > Alvaro Herrera writes: >> Tom Lane wrote: >>> "Shulgin, Oleksandr" writes: This post summarizes a few weeks of research of ANALYZE statistics distribution on

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-08 Thread Shulgin, Oleksandr
On Tue, Mar 8, 2016 at 3:36 PM, Joel Jacobson wrote: > Hi Alex, > > Thanks for excellent research. > Joel, Thank you for spending your time to run these :-) I've run your queries against Trustly's production database and I can > confirm your findings, the results are

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-08 Thread Joel Jacobson
Hi Alex, Thanks for excellent research. I've run your queries against Trustly's production database and I can confirm your findings, the results are similar: WITH ... SELECT count(1), min(hist_ratio)::real, avg(hist_ratio)::real, max(hist_ratio)::real,

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-08 Thread Shulgin, Oleksandr
On Mon, Mar 7, 2016 at 6:02 PM, Jeff Janes wrote: > On Mon, Mar 7, 2016 at 3:17 AM, Shulgin, Oleksandr > wrote: > > > > They might get that different plan when they upgrade to the latest major > > version anyway. Is it set somewhere that

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-07 Thread Jeff Janes
On Mon, Mar 7, 2016 at 3:17 AM, Shulgin, Oleksandr wrote: > On Fri, Mar 4, 2016 at 7:27 PM, Robert Haas wrote: >> >> On Thu, Mar 3, 2016 at 2:48 AM, Shulgin, Oleksandr >> wrote: >> > On Wed, Mar 2, 2016 at 7:33

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-07 Thread Tomas Vondra
Hi, On Mon, 2016-03-07 at 12:17 +0100, Shulgin, Oleksandr wrote: > On Fri, Mar 4, 2016 at 7:27 PM, Robert Haas > wrote: > On Thu, Mar 3, 2016 at 2:48 AM, Shulgin, Oleksandr > wrote: > > On Wed, Mar 2, 2016 at 7:33 PM,

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-07 Thread Shulgin, Oleksandr
On Fri, Mar 4, 2016 at 7:27 PM, Robert Haas wrote: > On Thu, Mar 3, 2016 at 2:48 AM, Shulgin, Oleksandr > wrote: > > On Wed, Mar 2, 2016 at 7:33 PM, Alvaro Herrera > > > wrote: > >> Shulgin, Oleksandr wrote: > >> >

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-04 Thread Robert Haas
On Thu, Mar 3, 2016 at 2:48 AM, Shulgin, Oleksandr wrote: > On Wed, Mar 2, 2016 at 7:33 PM, Alvaro Herrera > wrote: >> Shulgin, Oleksandr wrote: >> >> > Alright. I'm attaching the latest version of this patch split in two >> > parts: the

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-02 Thread Shulgin, Oleksandr
On Wed, Mar 2, 2016 at 7:33 PM, Alvaro Herrera wrote: > Shulgin, Oleksandr wrote: > > > Alright. I'm attaching the latest version of this patch split in two > > parts: the first one is NULLs-related bugfix and the second is the > > "improvement" part, which applies on

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-02 Thread Alvaro Herrera
Shulgin, Oleksandr wrote: > Alright. I'm attaching the latest version of this patch split in two > parts: the first one is NULLs-related bugfix and the second is the > "improvement" part, which applies on top of the first one. So is this null-related bugfix supposed to be backpatched? (I

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-02 Thread Shulgin, Oleksandr
On Wed, Mar 2, 2016 at 5:46 PM, David Steele wrote: > On 3/2/16 11:10 AM, Shulgin, Oleksandr wrote: > > On Wed, Feb 24, 2016 at 12:30 AM, Tomas Vondra > > > > wrote: > > > > I think it'd be useful not to

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-02 Thread David Steele
On 3/2/16 11:10 AM, Shulgin, Oleksandr wrote: > On Wed, Feb 24, 2016 at 12:30 AM, Tomas Vondra > > wrote: > > I think it'd be useful not to have all the changes in one lump, but > structure this as a patch series with

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-02 Thread Shulgin, Oleksandr
On Wed, Feb 24, 2016 at 12:30 AM, Tomas Vondra wrote: > Hi, > > On 02/08/2016 03:01 PM, Shulgin, Oleksandr wrote: > > > ... > >> >> I've incorporated this fix into the v2 of my patch, I think it is >> related closely enough. Also, added corresponding changes to >>

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-02-23 Thread Tomas Vondra
Hi, On 02/08/2016 03:01 PM, Shulgin, Oleksandr wrote: > ... I've incorporated this fix into the v2 of my patch, I think it is related closely enough. Also, added corresponding changes to compute_distinct_stats(), which doesn't produce a histogram. I think it'd be useful not to have all the

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-02-08 Thread Shulgin, Oleksandr
On Mon, Jan 25, 2016 at 5:11 PM, Shulgin, Oleksandr < oleksandr.shul...@zalando.de> wrote: > > On Sat, Jan 23, 2016 at 11:22 AM, Tomas Vondra < tomas.von...@2ndquadrant.com> wrote: >> >> >> Overall, I think this is really about deciding when to cut-off the MCV, so that it does not grow needlessly

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-01-25 Thread Shulgin, Oleksandr
On Sat, Jan 23, 2016 at 11:22 AM, Tomas Vondra wrote: > Hi, > > On 01/20/2016 10:49 PM, Alvaro Herrera wrote: > >> >> Tom, are you reviewing this for the current commitfest? >> > > While I'm not the right Tom, I've been looking at the patch recently, so > let me

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-01-23 Thread Tomas Vondra
Hi, On 01/20/2016 10:49 PM, Alvaro Herrera wrote: Tom Lane wrote: "Shulgin, Oleksandr" writes: This post summarizes a few weeks of research of ANALYZE statistics distribution on one of our bigger production databases with some real-world data and proposes a

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-01-20 Thread Tom Lane
Alvaro Herrera writes: > Tom Lane wrote: >> "Shulgin, Oleksandr" writes: >>> This post summarizes a few weeks of research of ANALYZE statistics >>> distribution on one of our bigger production databases with some real-world >>> data and

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-01-20 Thread Alvaro Herrera
Tom Lane wrote: > "Shulgin, Oleksandr" writes: > > This post summarizes a few weeks of research of ANALYZE statistics > > distribution on one of our bigger production databases with some real-world > > data and proposes a patch to rectify some of the oddities

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-01-18 Thread Shulgin, Oleksandr
On Wed, Dec 2, 2015 at 10:20 AM, Shulgin, Oleksandr < oleksandr.shul...@zalando.de> wrote: > On Tue, Dec 1, 2015 at 7:00 PM, Tom Lane wrote: > >> "Shulgin, Oleksandr" writes: >> > This post summarizes a few weeks of research of ANALYZE

Re: [HACKERS] More stable query plans via more predictable column statistics

2015-12-08 Thread Robert Haas
On Fri, Dec 4, 2015 at 12:53 PM, Tom Lane wrote: > Robert Haas writes: >> Still, maybe we should try to sneak at least this much into >> 9.5 RSN, because I have to think this is going to help people with >> mostly-NULL (or mostly-really-wide) columns. >

Re: [HACKERS] More stable query plans via more predictable column statistics

2015-12-07 Thread Shulgin, Oleksandr
On Fri, Dec 4, 2015 at 6:48 PM, Robert Haas wrote: > On Tue, Dec 1, 2015 at 10:21 AM, Shulgin, Oleksandr > wrote: > > > > What I have found is that in a significant percentage of instances, when > a > > duplicate sample value is *not* put

Re: [HACKERS] More stable query plans via more predictable column statistics

2015-12-04 Thread Tom Lane
Robert Haas writes: > Still, maybe we should try to sneak at least this much into > 9.5 RSN, because I have to think this is going to help people with > mostly-NULL (or mostly-really-wide) columns. Please no. We are trying to get to release, not destabilize things. I

Re: [HACKERS] More stable query plans via more predictable column statistics

2015-12-04 Thread Robert Haas
On Tue, Dec 1, 2015 at 10:21 AM, Shulgin, Oleksandr wrote: > Hi Hackers! > > This post summarizes a few weeks of research of ANALYZE statistics > distribution on one of our bigger production databases with some real-world > data and proposes a patch to rectify some

Re: [HACKERS] More stable query plans via more predictable column statistics

2015-12-02 Thread Shulgin, Oleksandr
On Tue, Dec 1, 2015 at 7:00 PM, Tom Lane wrote: > "Shulgin, Oleksandr" writes: > > This post summarizes a few weeks of research of ANALYZE statistics > > distribution on one of our bigger production databases with some > real-world > > data and

Re: [HACKERS] More stable query plans via more predictable column statistics

2015-12-01 Thread Tom Lane
"Shulgin, Oleksandr" writes: > This post summarizes a few weeks of research of ANALYZE statistics > distribution on one of our bigger production databases with some real-world > data and proposes a patch to rectify some of the oddities observed. Please add this to

[HACKERS] More stable query plans via more predictable column statistics

2015-12-01 Thread Shulgin, Oleksandr
Hi Hackers! This post summarizes a few weeks of research of ANALYZE statistics distribution on one of our bigger production databases with some real-world data and proposes a patch to rectify some of the oddities observed. Introduction We have observed that for certain data sets