On 25/12/2024 16:36, Richard Guo wrote:
On Wed, Dec 25, 2024 at 5:14 PM Richard Guo <guofengli...@gmail.com> wrote:
On Wed, Dec 25, 2024 at 11:34 AM Andrei Lepikhov <lepi...@gmail.com> wrote:
2. It is ok for Vars. But what about expressions? We use equal() in
distinct, MCV and dependencies modules. Do we need to remove nulls
before using extended statistics as a general rule?
AFAIU, the expressions in extended statistics are not decorated with
any nullingrels bits, are they?
I've just realized that there may be other places with similar issues,
not just in estimate_num_groups. For instance,
I'm pleased to see that you've grasped my initially unclear idea. Yeah,
it seems that all types of statistics may be lost because of varnullingrels.
In v16 and later, the nullingrels within the expression "t2.a + t2.b"
prevent it from being matched to the corresponding expression in
extended statistics, forcing us to use DEFAULT_UNK_SEL(0.005).
It seems that we need to strip out the nullingrels bits from
expressions before matching them to extended statistics or
expressional index columns in more places.
I think Tomas Vondra may have a decisive opinion in this place: we have
already discussed some approaches to calculate NULLs generated by RHS of
Left Join shortly.
Maybe we can commit a quick cure like the one provided in your patch,
but we should remember this example - it is not apparent to me how to
estimate a group of clauses in the case when part of Vars has
varnullingrels and part of them - doesn't.
Also, I think this is a good example that an explain analyse summary
could have some sort of extended statistics usage report. It can help to
clearly identify cases when extended statistics don't work, but should.
- something like already implemented in SQL Server.
--
regards, Andrei Lepikhov