I would agree if the FK was independently on columns a and b, but it isn't.
It is on (a,b). Since we are joining on the FK fields, which are also the PK
of the parent, we have:
NDVp = rows(parent)
and
NDVc = NDV(a,b) != NDV(a) * NDV(b)
and
NDVc <= NDVp
so
max(NDVp,NDVc) = NDVp = rows(parent)
and
j
Ah, but that cardinality underestimation has nothing to do with the join or
the foreign key relationship, but solely with the other fallacy of the cbo
- the predicate independence assumption. In your example, the predicates a
and b are completely dependent; once you choose one, the other is
det
Wolfgang,
Thanks for the response. The problem I am seeing is slightly different.
(I'll try to post some more detailed data, when I have the time). It's time
to take a deep breath and be a bit clearer in my description.
The issue arises when the PK of the parent is made up of more than 1 field.
F
That's the problem with answering without thinking the answer completely
through. My example below was only looking at the relationship from the
child table side. If you look at it from the parent table side and add
predicates - probably the more frequently used scenario - the problem
becomes c
Do you have concrete numbers. It''s been a while that I did my tests for
the paper and I would have to dig out my old testcases, but I was left with
the impression that the join cardinality "formula" was derived from
parent-child relationship joins and that the cardinality estimates are OK
for
Is there any way to get the CBO (8.1.7) to recognize parent/child
relationships? This seems to be an extreme example of the Join Independence
Assumption Fallacy (or I guess maybe the Predicate Independence Assumption)
discussed by Wolfgang Breitling in "Fallacies of the Cost Based Optimizer".
If I