Re: Additional improvements to extended statistics

2020-12-08 Thread Tomas Vondra
On 12/7/20 5:15 PM, Dean Rasheed wrote: > On Wed, 2 Dec 2020 at 15:51, Dean Rasheed wrote: >> >> The sort of queries I had in mind were things like this: >> >> WHERE (a = 1 AND b = 1) OR (a = 2 AND b = 2) >> >> However, the new code doesn't apply the extended stats directly using >>

Re: Additional improvements to extended statistics

2020-12-07 Thread Dean Rasheed
On Wed, 2 Dec 2020 at 15:51, Dean Rasheed wrote: > > The sort of queries I had in mind were things like this: > > WHERE (a = 1 AND b = 1) OR (a = 2 AND b = 2) > > However, the new code doesn't apply the extended stats directly using > clauselist_selectivity_or() for this kind of query because

Re: Additional improvements to extended statistics

2020-12-03 Thread Dean Rasheed
On Wed, 2 Dec 2020 at 16:34, Tomas Vondra wrote: > > On 12/2/20 4:51 PM, Dean Rasheed wrote: > > > > Barring any further comments, I'll push this sometime soon. > > +1 > Pushed. Regards, Dean

Re: Additional improvements to extended statistics

2020-12-02 Thread Tomas Vondra
On 12/2/20 4:51 PM, Dean Rasheed wrote: > On Sun, 29 Nov 2020 at 21:02, Tomas Vondra > wrote: >> >> I wonder how much of the comment before clauselist_selectivity should >> move to clauselist_selectivity_ext - it does talk about range clauses >> and so on, but clauselist_selectivity does not

Re: Additional improvements to extended statistics

2020-12-02 Thread Dean Rasheed
On Sun, 29 Nov 2020 at 21:02, Tomas Vondra wrote: > > I wonder how much of the comment before clauselist_selectivity should > move to clauselist_selectivity_ext - it does talk about range clauses > and so on, but clauselist_selectivity does not really deal with that. > But maybe that's just an

Re: Additional improvements to extended statistics

2020-12-01 Thread Tomas Vondra
On 12/1/20 9:15 AM, Dean Rasheed wrote: > On Sun, 29 Nov 2020 at 21:02, Tomas Vondra > wrote: >> >> Those are fairly minor issues. I don't have any deeper objections, and >> it seems committable. Do you plan to do that sometime soon? >> > > OK, I've updated the patch status in the CF app, and I

Re: Additional improvements to extended statistics

2020-12-01 Thread Dean Rasheed
On Sun, 29 Nov 2020 at 21:02, Tomas Vondra wrote: > > Those are fairly minor issues. I don't have any deeper objections, and > it seems committable. Do you plan to do that sometime soon? > OK, I've updated the patch status in the CF app, and I should be able to push it in the next day or so.

Re: Additional improvements to extended statistics

2020-11-29 Thread Tomas Vondra
On 11/29/20 3:57 PM, Dean Rasheed wrote: >> On Wed, 18 Nov 2020 at 22:37, Tomas Vondra >> wrote: >>> >>> Seems fine to me, although the "_opt_ext_stats" is rather cryptic. >>> AFAICS we use "_internal" for similar functions. >>> > > I have been thinking about this some more. The one part of

Re: Additional improvements to extended statistics

2020-11-19 Thread Dean Rasheed
On Wed, 18 Nov 2020 at 22:37, Tomas Vondra wrote: > > Seems fine to me, although the "_opt_ext_stats" is rather cryptic. > AFAICS we use "_internal" for similar functions. > There's precedent for using "_opt_xxx" for function variants that add an option to existing functions, but I agree that in

Re: Additional improvements to extended statistics

2020-11-18 Thread Tomas Vondra
On 11/17/20 4:35 PM, Dean Rasheed wrote: > On Thu, 12 Nov 2020 at 14:18, Tomas Vondra > wrote: >> >> Here is an improved WIP version of the patch series, modified to address >> the issue with repeatedly applying the extended statistics, as discussed >> with Dean in this thread. It's a bit

Re: Additional improvements to extended statistics

2020-11-17 Thread Dean Rasheed
On Thu, 12 Nov 2020 at 14:18, Tomas Vondra wrote: > > Here is an improved WIP version of the patch series, modified to address > the issue with repeatedly applying the extended statistics, as discussed > with Dean in this thread. It's a bit rough and not committable, but I > need some feedback so

Re: Additional improvements to extended statistics

2020-11-12 Thread Dean Rasheed
On Thu, 12 Nov 2020 at 14:18, Tomas Vondra wrote: > > Here is an improved WIP version of the patch series, modified to address > the issue with repeatedly applying the extended statistics, as discussed > with Dean in this thread. It's a bit rough and not committable, but I > need some feedback so

Re: Additional improvements to extended statistics

2020-11-12 Thread Tomas Vondra
Hi, Here is an improved WIP version of the patch series, modified to address the issue with repeatedly applying the extended statistics, as discussed with Dean in this thread. It's a bit rough and not committable, but I need some feedback so I'm posting it in this state. (Note: The WIP patch is

Re: Additional improvements to extended statistics

2020-07-02 Thread Tomas Vondra
On Wed, Jul 01, 2020 at 01:19:40PM +0200, Daniel Gustafsson wrote: On 24 Mar 2020, at 15:33, Tomas Vondra wrote: On Tue, Mar 24, 2020 at 01:20:07PM +, Dean Rasheed wrote: Sounds like a reasonable approach, but I think it would be better to preserve the current public API by having

Re: Additional improvements to extended statistics

2020-07-01 Thread Daniel Gustafsson
> On 24 Mar 2020, at 15:33, Tomas Vondra wrote: > > On Tue, Mar 24, 2020 at 01:20:07PM +, Dean Rasheed wrote: >> Sounds like a reasonable approach, but I think it would be better to >> preserve the current public API by having clauselist_selectivity() >> become a thin wrapper around a new

Re: Additional improvements to extended statistics

2020-03-24 Thread Thomas Munro
On Sun, Mar 15, 2020 at 3:23 PM Tomas Vondra wrote: > On Sun, Mar 15, 2020 at 02:48:02PM +1300, Thomas Munro wrote: > >Stimulated by some bad plans involving JSON, I found my way to your > >WIP stats-on-expressions patch in this thread. Do I understand > >correctly that it will eventually also

Re: Additional improvements to extended statistics

2020-03-24 Thread Tomas Vondra
On Tue, Mar 24, 2020 at 01:20:07PM +, Dean Rasheed wrote: On Tue, 24 Mar 2020 at 01:08, Tomas Vondra wrote: Hmmm. So let's consider a simple OR clause with two arguments, both covered by single statistics object. Something like this: CREATE TABLE t (a int, b int); INSERT INTO t

Re: Additional improvements to extended statistics

2020-03-24 Thread Dean Rasheed
On Tue, 24 Mar 2020 at 01:08, Tomas Vondra wrote: > > Hmmm. So let's consider a simple OR clause with two arguments, both > covered by single statistics object. Something like this: > >CREATE TABLE t (a int, b int); >INSERT INTO t SELECT mod(i, 10), mod(i, 10) > FROM

Re: Additional improvements to extended statistics

2020-03-23 Thread Tomas Vondra
On Mon, Mar 23, 2020 at 08:21:42AM +, Dean Rasheed wrote: On Sat, 21 Mar 2020 at 21:59, Tomas Vondra wrote: Ah, right. Yeah, I think that should work. I thought there would be some volatility due to groups randomly not making it into the MCV list, but you're right it's possible to

Re: Additional improvements to extended statistics

2020-03-23 Thread Dean Rasheed
On Sat, 21 Mar 2020 at 21:59, Tomas Vondra wrote: > > Ah, right. Yeah, I think that should work. I thought there would be some > volatility due to groups randomly not making it into the MCV list, but > you're right it's possible to construct the data in a way to make it > perfectly deterministic.

Re: Additional improvements to extended statistics

2020-03-21 Thread Tomas Vondra
On Thu, Mar 19, 2020 at 07:08:07PM +, Dean Rasheed wrote: On Wed, 18 Mar 2020 at 19:31, Tomas Vondra wrote: Attached is a rebased patch series, addressing both those issues. I've been wondering why none of the regression tests failed because of the 0.0 vs. 1.0 issue, but I think the

Re: Additional improvements to extended statistics

2020-03-19 Thread Dean Rasheed
On Wed, 18 Mar 2020 at 19:31, Tomas Vondra wrote: > > Attached is a rebased patch series, addressing both those issues. > > I've been wondering why none of the regression tests failed because of > the 0.0 vs. 1.0 issue, but I think the explanation is pretty simple - to > make the tests stable,

Re: Additional improvements to extended statistics

2020-03-18 Thread Tomas Vondra
On Sun, Mar 15, 2020 at 12:37:37PM +, Dean Rasheed wrote: On Sun, 15 Mar 2020 at 00:08, Tomas Vondra wrote: On Sat, Mar 14, 2020 at 05:56:10PM +0100, Tomas Vondra wrote: > >Attached is a patch series rebased on top of the current master, after >committing the ScalarArrayOpExpr

Re: Additional improvements to extended statistics

2020-03-15 Thread Dean Rasheed
On Sun, 15 Mar 2020 at 00:08, Tomas Vondra wrote: > > On Sat, Mar 14, 2020 at 05:56:10PM +0100, Tomas Vondra wrote: > > > >Attached is a patch series rebased on top of the current master, after > >committing the ScalarArrayOpExpr enhancements. I've updated the OR patch > >to get rid of the code

Re: Additional improvements to extended statistics

2020-03-14 Thread Tomas Vondra
On Sun, Mar 15, 2020 at 02:48:02PM +1300, Thomas Munro wrote: On Sun, Mar 15, 2020 at 1:08 PM Tomas Vondra wrote: On Sat, Mar 14, 2020 at 05:56:10PM +0100, Tomas Vondra wrote: >Attached is a patch series rebased on top of the current master, after >committing the ScalarArrayOpExpr

Re: Additional improvements to extended statistics

2020-03-14 Thread Thomas Munro
On Sun, Mar 15, 2020 at 1:08 PM Tomas Vondra wrote: > On Sat, Mar 14, 2020 at 05:56:10PM +0100, Tomas Vondra wrote: > >Attached is a patch series rebased on top of the current master, after > >committing the ScalarArrayOpExpr enhancements. I've updated the OR patch > >to get rid of the code

Re: Additional improvements to extended statistics

2020-03-14 Thread Tomas Vondra
On Sat, Mar 14, 2020 at 05:56:10PM +0100, Tomas Vondra wrote: ... Attached is a patch series rebased on top of the current master, after committing the ScalarArrayOpExpr enhancements. I've updated the OR patch to get rid of the code duplication, and barring objections I'll get it committed

Re: Additional improvements to extended statistics

2020-03-14 Thread Tomas Vondra
On Fri, Mar 13, 2020 at 04:54:51PM +, Dean Rasheed wrote: On Mon, 9 Mar 2020 at 00:06, Tomas Vondra wrote: On Mon, Mar 09, 2020 at 01:01:57AM +0100, Tomas Vondra wrote: > >Attached is an updated patch series >with parts 0002 and 0003 adding tests demonstrating the issue and then >fixing

Re: Additional improvements to extended statistics

2020-03-13 Thread Dean Rasheed
On Mon, 9 Mar 2020 at 00:06, Tomas Vondra wrote: > > On Mon, Mar 09, 2020 at 01:01:57AM +0100, Tomas Vondra wrote: > > > >Attached is an updated patch series > >with parts 0002 and 0003 adding tests demonstrating the issue and then > >fixing it (both shall be merged to 0001). > > > > One day I

Re: Additional improvements to extended statistics

2020-03-11 Thread Dean Rasheed
On Mon, 9 Mar 2020 at 18:19, Tomas Vondra wrote: > > On Mon, Mar 09, 2020 at 08:35:48AM +, Dean Rasheed wrote: > > > > P(a,b) = P(a) * [f + (1-f)*P(b)] > > > >because it might return a value that is larger than P(b), which > >obviously should not be possible. > > Hmmm, yeah. It took me a
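
[Editor's illustration] The concern quoted in this message can be checked with plain arithmetic. The sketch below is not PostgreSQL code: the function name dependency_estimate and the sample numbers are made up for illustration, and f is assumed to be the degree of the functional dependency a => b, a value between 0 and 1. It only shows that the quoted formula can yield an estimate above P(b), although a joint probability can never exceed either of its marginals.

```python
def dependency_estimate(p_a: float, p_b: float, f: float) -> float:
    """Evaluate the quoted formula P(a,b) = P(a) * [f + (1-f)*P(b)].

    p_a, p_b: selectivities of the two clauses taken individually.
    f: assumed degree of the functional dependency a => b, in [0, 1].
    """
    return p_a * (f + (1.0 - f) * p_b)


# Hypothetical inputs: a common value of a, a rare value of b, strong dependency.
p_a, p_b, f = 0.5, 0.1, 0.9

estimate = dependency_estimate(p_a, p_b, f)

# The true joint probability is bounded above by min(P(a), P(b)).
upper_bound = min(p_a, p_b)

print("estimate exceeds P(b):", estimate > p_b)
print("estimate exceeds min(P(a), P(b)):", estimate > upper_bound)

# With f = 0 the formula reduces to the independence assumption P(a) * P(b).
independent = dependency_estimate(p_a, p_b, 0.0)
print("f = 0 matches P(a)*P(b):", abs(independent - p_a * p_b) < 1e-12)
```

With these inputs the formula gives roughly 0.455, well above P(b) = 0.1. One possible guard would be to clamp the result to min(P(a), P(b)); whether the thread settled on that is not visible in this truncated excerpt.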

Re: Additional improvements to extended statistics

2020-03-09 Thread Tomas Vondra
On Mon, Mar 09, 2020 at 08:35:48AM +, Dean Rasheed wrote: On Mon, 9 Mar 2020 at 00:02, Tomas Vondra wrote: Speaking of which, would you take a look at [1]? I think supporting SAOP is fine, but I wonder if you agree with my conclusion we can't really support inclusion @> as explained in

Re: Additional improvements to extended statistics

2020-03-09 Thread Dean Rasheed
On Mon, 9 Mar 2020 at 00:02, Tomas Vondra wrote: > > Speaking of which, would you take a look at [1]? I think supporting SAOP > is fine, but I wonder if you agree with my conclusion we can't really > support inclusion @> as explained in [2]. > Hmm, I'm not sure. However, thinking about your

Re: Additional improvements to extended statistics

2020-03-08 Thread Tomas Vondra
On Mon, Mar 09, 2020 at 01:01:57AM +0100, Tomas Vondra wrote: On Sun, Mar 08, 2020 at 07:17:10PM +, Dean Rasheed wrote: On Fri, 6 Mar 2020 at 12:58, Tomas Vondra wrote: Here is a rebased version of this patch series. I've polished the first two parts a bit - estimation of OR clauses and

Re: Additional improvements to extended statistics

2020-03-08 Thread Tomas Vondra
On Sun, Mar 08, 2020 at 07:17:10PM +, Dean Rasheed wrote: On Fri, 6 Mar 2020 at 12:58, Tomas Vondra wrote: Here is a rebased version of this patch series. I've polished the first two parts a bit - estimation of OR clauses and (Var op Var) clauses. Hi, I've been looking over the first

Re: Additional improvements to extended statistics

2020-03-08 Thread Dean Rasheed
On Fri, 6 Mar 2020 at 12:58, Tomas Vondra wrote: > > Here is a rebased version of this patch series. I've polished the first > two parts a bit - estimation of OR clauses and (Var op Var) clauses. > Hi, I've been looking over the first patch (OR list support). It mostly looks reasonable to me,

Re: Additional improvements to extended statistics

2020-03-06 Thread Tomas Vondra
On Fri, Mar 06, 2020 at 01:15:56AM +0100, Tomas Vondra wrote: Hi, Here is a rebased version of this patch series. I've polished the first two parts a bit - estimation of OR clauses and (Var op Var) clauses, and added a bunch of regression tests to exercise this code. It's not quite there yet,

Re: Additional improvements to extended statistics

2020-03-05 Thread Tomas Vondra
Hi, Here is a rebased version of this patch series. I've polished the first two parts a bit - estimation of OR clauses and (Var op Var) clauses, and added a bunch of regression tests to exercise this code. It's not quite there yet, but I think it's feasible to get this committed for PG13. The

Re: Additional improvements to extended statistics

2020-01-14 Thread Pavel Stehule
On Tue, 14 Jan 2020 at 0:00, Tomas Vondra wrote: > Hi, > > Now that I've committed [1] which allows us to use multiple extended > statistics per table, I'd like to start a thread discussing a couple of > additional improvements for extended statistics. I've considered > s

Additional improvements to extended statistics

2020-01-13 Thread Tomas Vondra
Hi, Now that I've committed [1] which allows us to use multiple extended statistics per table, I'd like to start a thread discussing a couple of additional improvements for extended statistics. I've considered starting a separate patch for each, but that would be messy as those changes