On 16 July 2018 at 13:23, Tomas Vondra <[email protected]> wrote:
>>> The top-level clauses allow us to make such deductions, with deeper
>>> clauses it's much more difficult (perhaps impossible). Because for
>>> example with (a=1 AND b=1) there can be just a single match, so if we
>>> find it in MCV we're done. With clauses like ((a=1 OR a=2) AND (b=1 OR
>>> b=2)) it's not that simple, because there may be multiple combinations
>>> and so a match in MCV does not guarantee anything.
>>
>> Actually, it guarantees a lower bound on the overall selectivity, and
>> maybe that's the best that we can do in the absence of any other
>> stats.
>>
> Hmmm, is that actually true? Let's consider a simple example, with two
> columns, each with just 2 values, and a "perfect" MCV list:
>
>     a | b | frequency
>    -------------------
>     1 | 1 | 0.5
>     2 | 2 | 0.5
>
> And let's estimate sel(a=1 & b=2).

OK. In this case, there are no MCV matches, so there is no lower bound
(it's 0).

What we could do though is also impose an upper bound, based on the
sum of non-matching MCV frequencies. In this case, the upper bound is
also 0, so we could actually say the resulting selectivity is 0.

Regards,
Dean
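
To make the two bounds concrete, here is a minimal sketch in Python. It is not the patch's code: mcv_bounds and its arguments are hypothetical names, and it assumes the upper bound is taken as 1 minus the sum of the non-matching MCV frequencies, which is one reading of "based on the sum of non-matching MCV frequencies".

```python
from typing import Callable, Dict, Tuple

Row = Tuple[int, int]  # one (a, b) combination from the MCV list


def mcv_bounds(mcv: Dict[Row, float],
               matches: Callable[[Row], bool]) -> Tuple[float, float]:
    """Return (lower, upper) selectivity bounds implied by an MCV list.

    Hypothetical helper for illustration only, not PostgreSQL code.
    """
    # Total frequency of MCV items that satisfy the clauses: a lower bound.
    matched = sum((f for row, f in mcv.items() if matches(row)), 0.0)
    # Total frequency of MCV items that do not satisfy the clauses.
    unmatched = sum((f for row, f in mcv.items() if not matches(row)), 0.0)
    lower = matched
    # Assumption: rows covered by non-matching MCV items cannot match,
    # so at most the remaining fraction of the table can.
    upper = max(lower, 1.0 - unmatched)
    return lower, upper


# The "perfect" MCV list from the example above.
mcv = {(1, 1): 0.5, (2, 2): 0.5}

# sel(a=1 & b=2): no MCV item matches and the list covers the whole table.
print(mcv_bounds(mcv, lambda r: r[0] == 1 and r[1] == 2))  # (0.0, 0.0)
```

When the MCV list does not cover the whole table the two bounds differ, and the estimate has to be chosen somewhere between them using whatever other statistics are available.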
