Fwd: Merging statistics from children instead of re-sampling everything

2022-08-15 Thread Damir Belyalov
> > 3) stadistinct - This is quite problematic. We only have the per-child > estimates, and it's not clear if there's any overlap. For now I've just > summed it up, because that's safer / similar to what we do for gather > merge paths etc. Maybe we could improve this by estimating the overlap > som

Re: Merging statistics from children instead of re-sampling everything

2022-02-18 Thread Andrey V. Lepikhov
On 2/14/22 20:16, Tomas Vondra wrote: On 2/14/22 11:22, Andrey V. Lepikhov wrote: On 2/11/22 20:12, Tomas Vondra wrote: On 2/11/22 05:29, Andrey V. Lepikhov wrote: On 2/11/22 03:37, Tomas Vondra wrote: That being said, this thread was not really about foreign partitions, but about re-anal

Re: Merging statistics from children instead of re-sampling everything

2022-02-14 Thread Tomas Vondra
On 2/14/22 11:22, Andrey V. Lepikhov wrote: On 2/11/22 20:12, Tomas Vondra wrote: On 2/11/22 05:29, Andrey V. Lepikhov wrote: On 2/11/22 03:37, Tomas Vondra wrote: That being said, this thread was not really about foreign partitions, but about re-analyzing inheritance trees in general. An

Re: Merging statistics from children instead of re-sampling everything

2022-02-14 Thread Andrey V. Lepikhov
On 2/11/22 20:12, Tomas Vondra wrote: On 2/11/22 05:29, Andrey V. Lepikhov wrote: On 2/11/22 03:37, Tomas Vondra wrote: That being said, this thread was not really about foreign partitions, but about re-analyzing inheritance trees in general. And sampling foreign partitions doesn't really sol

Re: Merging statistics from children instead of re-sampling everything

2022-02-11 Thread Robert Haas
On Wed, Jun 30, 2021 at 11:15 AM Tomas Vondra wrote: > You're right maintaining a per-partition samples and merging those might > solve (or at least reduce) some of the problems, e.g. eliminating most > of the I/O that'd be needed for sampling. And yeah, it's not entirely > clear how to merge some

Re: Merging statistics from children instead of re-sampling everything

2022-02-11 Thread Tomas Vondra
On 2/11/22 05:29, Andrey V. Lepikhov wrote: On 2/11/22 03:37, Tomas Vondra wrote: That being said, this thread was not really about foreign partitions, but about re-analyzing inheritance trees in general. And sampling foreign partitions doesn't really solve that - we'll still do the sampling

Re: Merging statistics from children instead of re-sampling everything

2022-02-10 Thread Andrey V. Lepikhov
On 2/11/22 03:37, Tomas Vondra wrote: That being said, this thread was not really about foreign partitions, but about re-analyzing inheritance trees in general. And sampling foreign partitions doesn't really solve that - we'll still do the sampling over and over. IMO, to solve the problem we sho

Re: Merging statistics from children instead of re-sampling everything

2022-02-10 Thread Tomas Vondra
On 2/10/22 12:50, Andrey Lepikhov wrote: > On 21/1/2022 01:25, Tomas Vondra wrote: >> But I don't have a very good idea what to do about statistics that we >> can't really merge. For some types of statistics it's rather tricky to >> reasonably merge the results - ndistinct is a simple example, a

Re: Merging statistics from children instead of re-sampling everything

2022-02-10 Thread Andrey Lepikhov
On 21/1/2022 01:25, Tomas Vondra wrote: But I don't have a very good idea what to do about statistics that we can't really merge. For some types of statistics it's rather tricky to reasonably merge the results - ndistinct is a simple example, although we could work around that by building and mer

Re: Merging statistics from children instead of re-sampling everything

2022-01-20 Thread Tomas Vondra
On 6/30/21 17:15, Tomas Vondra wrote: > On 6/30/21 2:55 PM, Andrey Lepikhov wrote: >> Sorry, I forgot to send CC into pgsql-hackers. >> On 29/6/21 13:23, Tomas Vondra wrote: >>> Because sampling is fairly expensive, especially if you have to do it >>> for large number of child relations. And you'd

Re: Merging statistics from children instead of re-sampling everything

2021-06-30 Thread Tomas Vondra
On 6/30/21 2:55 PM, Andrey Lepikhov wrote: Sorry, I forgot to send CC into pgsql-hackers. On 29/6/21 13:23, Tomas Vondra wrote: Because sampling is fairly expensive, especially if you have to do it for large number of child relations. And you'd have to do that every time *any* child triggers au

Re: Merging statistics from children instead of re-sampling everything

2021-06-30 Thread Andrey Lepikhov
Sorry, I forgot to send CC into pgsql-hackers. On 29/6/21 13:23, Tomas Vondra wrote: Because sampling is fairly expensive, especially if you have to do it for large number of child relations. And you'd have to do that every time *any* child triggers autovacuum, pretty much. Merging the stats is

Re: Merging statistics from children instead of re-sampling everything

2021-03-29 Thread Tomas Vondra
Hi, I'd like to point out two less obvious things, about how this relates to Tom's proposal [1] and patch [2] from 2015. Tom approached the problem from a different direction, essentially allowing Var to be associated with a list of statistics instead of just one. So it's a somewhat orthogonal so

Re: Merging statistics from children instead of re-sampling everything

2021-03-29 Thread Tomas Vondra
On 3/29/21 9:24 PM, Tomas Vondra wrote: > > > On 3/29/21 8:36 PM, Justin Pryzby wrote: >> Thanks for taking a fresh look at this. >> >> As you've written it, this can apply to either/both partitioned or >> inheritance. >> I imagine when "MERGE" goes away, this should apply only to partitioned

Re: Merging statistics from children instead of re-sampling everything

2021-03-29 Thread Tomas Vondra
On 3/29/21 8:36 PM, Justin Pryzby wrote: > Thanks for taking a fresh look at this. > > As you've written it, this can apply to either/both partitioned or > inheritance. > I imagine when "MERGE" goes away, this should apply only to partitioned > tables. > (Actually, personally I would advocate

Re: Merging statistics from children instead of re-sampling everything

2021-03-29 Thread Justin Pryzby
Thanks for taking a fresh look at this. As you've written it, this can apply to either/both partitioned or inheritance. I imagine when "MERGE" goes away, this should apply only to partitioned tables. (Actually, personally I would advocate to consider applying it to *both*, but I don't think that's

Merging statistics from children instead of re-sampling everything

2021-03-29 Thread Tomas Vondra
Hi, While reviewing the thread about issues with auto-analyze on partitioned tables [1] I remembered that the way we build statistics on the parent is somewhat expensive, because it samples rows from the children again. It's true we sample much smaller amounts of rows from each partition (proport