On 23/9/2025 12:20, Frédéric Yhuel wrote:
On 9/22/25 23:15, Andrei Lepikhov wrote:
It may solve at least one issue with the 'dependencies' statistics: a
single number describing the dependency between any two values in the
columns often leads to incorrect estimations, as I see.
For what it's worth, I've never encountered a case in my life as a
PostgreSQL support engineer where the 'dependency' kind could be useful.
I only successfully used the 'mcv' kind once (and that was only
partially successful, as it fixed the estimates but not the plan).

Thanks for your feedback!
I also don't think the 'dependencies' statistics are very useful now,
especially considering how much computation they require when multiple
columns are involved.
But is it the same for the 'ndistinct' statistics? It seems you should
love them - the number of groups in GROUP BY, DISTINCT, and even HashJoin
should be estimated more precisely, no?
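
To make the point concrete, here is a toy sketch in Python (not planner
code). It assumes a hypothetical two-column table in which zip_code
fully determines city, and it approximates the no-extended-statistics
estimate as the product of per-column distinct counts clamped to the row
count, which is a simplification of what the planner really does. A
multi-column distinct count is the kind of number an 'ndistinct'
statistic stores.

```python
# Hypothetical table: 10,000 rows, 500 zip codes, 50 cities.
# Each zip code belongs to exactly one city (a functional dependency).
rows = []
for i in range(10_000):
    zip_code = i % 500
    city = zip_code // 10
    rows.append((city, zip_code))

# Per-column distinct counts, as single-column statistics would give.
nd_city = len({city for city, _ in rows})
nd_zip = len({zip_code for _, zip_code in rows})

# Simplified independence assumption: multiply the per-column counts
# and clamp to the number of rows.
independent_estimate = min(nd_city * nd_zip, len(rows))

# Distinct count over the column pair, i.e. the true number of groups
# for GROUP BY city, zip_code.
actual_groups = len(set(rows))

print(nd_city, nd_zip)        # 50 500
print(independent_estimate)   # 10000
print(actual_groups)          # 500
```

In this toy case the independence-based guess is 20 times too high,
while the combined distinct count gives the exact number of groups.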
--
regards, Andrei Lepikhov