On 22/9/2025 18:09, Frédéric Yhuel wrote:
On 9/22/25 15:57, Andrei Lepikhov wrote:
I wonder if we could devise another kind of extended statistic that
would provide these "partitioned statistics" without actually partitioning.
I'm not sure I fully understand your case, but SQL Server demonstrates
Andrei Lepikhov writes:
> On 25/9/2025 12:41, Frédéric Yhuel wrote:
>> So, on SQL Server, you can do this:
>> CREATE STATISTICS FooStats ON foo (ackid, crit) WHERE crit = 'WARNING';
> Nice! Thanks for the report. I think the only reason why Postgres
> doesn't have it yet is the computational cost
On 25/9/2025 12:41, Frédéric Yhuel wrote:
So, on SQL Server, you can do this:
CREATE STATISTICS FooStats ON foo (ackid, crit) WHERE crit = 'WARNING';
It would be great to have a similar feature in PostgreSQL.
Nice! Thanks for the report. I think the only reason why Postgres
doesn't have it yet is the computational cost
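To make the comparison concrete, here is a sketch contrasting the extended-statistics syntax PostgreSQL has today with a hypothetical filtered form modeled on the SQL Server command quoted above. The statistics names are illustrative; the filtered variant is NOT valid PostgreSQL syntax, it is the feature being wished for in this thread.

```sql
-- Existing PostgreSQL syntax: extended statistics over the whole table.
CREATE STATISTICS foo_stats (ndistinct, dependencies, mcv)
    ON ackid, crit FROM foo;

-- Hypothetical filtered form, modeled on SQL Server's
-- CREATE STATISTICS ... WHERE (not supported by PostgreSQL today):
-- CREATE STATISTICS foo_warn_stats ON ackid, crit FROM foo
--     WHERE crit = 'WARNING';
```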
On 9/23/25 12:43, Andrei Lepikhov wrote:
But is it the same for the 'distinct' statistics? It seems you should
love it - the number of groups in GROUP-BY, DISTINCT, and even HashJoin
should be estimated more precisely, no?
I think it has more potential, and I would love to use this weapon,
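The 'distinct' statistics mentioned here are PostgreSQL's `ndistinct` extended-statistics kind; a minimal sketch of how they are created (table and column names taken from this thread, the statistics name is illustrative):

```sql
-- ndistinct extended statistics record the number of distinct
-- combinations across the listed columns, which feeds the
-- group-count estimates for GROUP BY / DISTINCT:
CREATE STATISTICS foo_nd (ndistinct) ON ackid, crit FROM foo;
ANALYZE foo;

-- The planner can now use the multi-column distinct count here:
EXPLAIN SELECT ackid, crit FROM foo GROUP BY ackid, crit;
```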
On 23/9/2025 15:31, Frédéric Yhuel wrote:
To get back to the topic of partitioned statistics, do you know if SQL
Server is smart enough to handle this case [1] that we discussed last
year? (with filtered statistics)
[1] https://www.postgresql.org/message-id/flat/b860c71a-7cab-4d88-
ad87-8c1f2
On 13/10/2025 16:55, Tom Lane wrote:
Andrei Lepikhov writes:
On 25/9/2025 12:41, Frédéric Yhuel wrote:
So, on SQL Server, you can do this:
CREATE STATISTICS FooStats ON foo (ackid, crit) WHERE crit = 'WARNING';
Nice! Thanks for the report. I think the only reason why Postgres
doesn't have i
On 9/22/25 23:15, Andrei Lepikhov wrote:
I'm not sure I fully understand your case, but SQL Server demonstrates
an interesting approach: they have a WHERE clause attached to
statistics. So, having implemented this, you may separate the whole
range of values inside the table into 'partitions'
On 9/23/25 12:20, Frédéric Yhuel wrote:
On 9/22/25 23:15, Andrei Lepikhov wrote:
I'm not sure I fully understand your case, but SQL Server demonstrates
an interesting approach: they have a WHERE clause attached to
statistics. So, having implemented this, you may separate the whole
range of values inside the table into 'partitions'
On 9/23/25 15:31, Frédéric Yhuel wrote:
To get back to the topic of partitioned statistics, do you know if SQL
Server is smart enough to handle this case [1] that we discussed last
year? (with filtered statistics)
[1] https://www.postgresql.org/message-id/flat/b860c71a-7cab-4d88-
ad87-8c1f
On 23/9/2025 12:20, Frédéric Yhuel wrote:
On 9/22/25 23:15, Andrei Lepikhov wrote:
It may solve at least one issue with the 'dependencies' statistics: a
single number describing the dependency between any two values in the
columns often leads to incorrect estimations, as I see.
For what it's worth
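For reference, the 'dependencies' statistics being criticized here store one functional-dependency degree per column combination; a sketch of creating and inspecting them (statistics name is illustrative):

```sql
-- dependencies statistics capture a single degree of functional
-- dependency per column combination, e.g. crit => ackid; this one
-- number is what can lead to poor estimates for atypical values:
CREATE STATISTICS foo_dep (dependencies) ON ackid, crit FROM foo;
ANALYZE foo;

-- Inspect the stored per-combination degrees:
SELECT dependencies FROM pg_stats_ext
WHERE statistics_name = 'foo_dep';
```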
On 9/22/25 15:57, Andrei Lepikhov wrote:
On 22/9/2025 15:37, Frédéric Yhuel wrote:
I wonder if this is an argument in favour of decoupling the sample
size and the precision of the statistics. Here, we basically want the
sample size to be as big as the table in order to include the few
(NULL, WARNING) values.
On 22/9/2025 15:37, Frédéric Yhuel wrote:
I wonder if this is an argument in favour of decoupling the sample size
and the precision of the statistics. Here, we basically want the sample
size to be as big as the table in order to include the few (NULL,
WARNING) values.
I also have seen how repea
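In current PostgreSQL the sample size and statistics precision are indeed coupled: ANALYZE samples roughly 300 × the statistics target rows, so the only knob is the per-column target. A sketch of today's workaround, using the column name from this thread:

```sql
-- Raise the per-column statistics target (hard maximum is 10000).
-- ANALYZE then samples about 300 * target rows, so sample size and
-- MCV/histogram precision cannot be tuned independently today:
ALTER TABLE foo ALTER COLUMN crit SET STATISTICS 10000;
ANALYZE foo;
```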
On 9/20/25 18:51, Tom Lane wrote:
I concluded that maybe I was overthinking this part. We only really
need to check the rowcount estimate, since the indexscan cost estimate
is already okay. And stats_ext.sql seems to have gotten away with
assuming that rowcount estimates are reliably reproducible
Frédéric Yhuel writes:
> On 9/18/25 18:40, Tom Lane wrote:
>> The attached fixes things so it works like it did pre-a391ff3c3.
> Indeed, it works well!
Thanks for testing!
>> I spent some time trying to devise a test case, and was reminded
>> of why I didn't have one before: it's hard to make a case that
On Thu, 18 Sep 2025 12:59:11 -0400
Tom Lane wrote:
> Jehan-Guillaume de Rorthais writes:
> > On a fresh instance from HEAD with its default configuration, it shows:
>
> > Index Scan using foo_s_idx on foo (cost=0.29..8.39 rows=3 width=13)
> > Index Cond: (s(crit, ackid) = true)
>
>
On 9/18/25 18:40, Tom Lane wrote:
The attached fixes things so it works like it did pre-a391ff3c3.
Indeed, it works well!
I spent some time trying to devise a test case, and was reminded
of why I didn't have one before: it's hard to make a case that
will be robust enough to not show diffs
Frédéric Yhuel writes:
> On 9/17/25 16:41, Tom Lane wrote:
>> Frédéric Yhuel writes:
>>> 2) the number of estimated rows is completely off in the second EXPLAIN,
>>> whereas the planner could easily use the statistics of foo_f_idx.
>> Hmm, not sure abo
Jehan-Guillaume de Rorthais writes:
> On a fresh instance from HEAD with its default configuration, it shows:
> Index Scan using foo_s_idx on foo (cost=0.29..8.39 rows=3 width=13)
> Index Cond: (s(crit, ackid) = true)
> It seems statistics shown in "pg_stats" view for function "s()" a
On Thu., Sep 18, 2025 at 13:40, Tom Lane wrote:
> I wrote:
> > Sigh ... so the answer is this used to work (since commit 39df0f150)
> > and then I carelessly broke it in commit a391ff3c3. If you try this
> > test case in versions 9.5..11 you get a spot-on rowcount estimate.
> > Serves me right for not having a test case I guess, but I'm astonished
I wrote:
> Sigh ... so the answer is this used to work (since commit 39df0f150)
> and then I carelessly broke it in commit a391ff3c3. If you try this
> test case in versions 9.5..11 you get a spot-on rowcount estimate.
> Serves me right for not having a test case I guess, but I'm astonished
> that
Hi there,
I think this discussion has a nice solution, thank you!
However, while poking around this issue yesterday, we also found something
surprising about estimated rows versus costs when using a function. Below is the
scenario to apply on top of Frédéric's to quickly expose the weirdness:
Thank you Laurenz and Tom! I'm going to quote Tom's email here:
On 9/17/25 16:41, Tom Lane wrote:
Frédéric Yhuel writes:
Hello, in the following, I don't understand why:
1) the expression index isn't used in the first EXPLAIN
The planner doesn't look for multi-clause matches of that sort.
On 9/17/25 16:57, Frédéric Yhuel wrote:
Yes, Laurenz made a similar suggestion, but the problem is that I'm
mostly interested in the estimated number of output rows... because in
the real query, there's a very bad Hash Join above (the Nested Loop is
*much* faster).
BTW, I've also tested an
Frédéric Yhuel writes:
> Hello, in the following, I don't understand why:
> 1) the expression index isn't used in the first EXPLAIN
The planner doesn't look for multi-clause matches of that sort.
You could apply a little ju-jitsu perhaps:
regression=# EXPLAIN (ANALYZE, SUMM
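A sketch of the kind of "ju-jitsu" being suggested: phrase the query so its clause textually matches an expression index, letting the planner use that index's expression statistics for the rowcount estimate. The index definition below is illustrative, not taken verbatim from the thread:

```sql
-- Hypothetical boolean expression index over the two columns
-- discussed in this thread:
CREATE INDEX foo_f_idx
    ON foo (((crit = 'WARNING' AND ackid IS NULL)));
ANALYZE foo;

-- Written so the clause matches the indexed expression, the
-- planner can draw on the expression's collected statistics:
EXPLAIN SELECT * FROM foo
WHERE (crit = 'WARNING' AND ackid IS NULL) = true;
```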
On Wed, 2025-09-17 at 15:55 +0200, Frédéric Yhuel wrote:
> Hello, in the following, I don't understand why:
>
> 1) the expression index isn't used in the first EXPLAIN
>
> 2) the number of estimated rows is completely off in the second EXPLAIN,
> whereas the planner could easily use the statistics of foo_f_idx.