Paul Rogers created IMPALA-8218:
-----------------------------------

             Summary: Use "simple bin model" to estimate M:N, FK join 
cardinality
                 Key: IMPALA-8218
                 URL: https://issues.apache.org/jira/browse/IMPALA-8218
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
    Affects Versions: Impala 3.1.0
            Reporter: Paul Rogers
            Assignee: Paul Rogers


When computing join cardinality, the planner must determine how much a filters 
reduce the cardinality of join keys. For example, in a M:1 (FK/PK) join, 
filtering on the left (M, FK) side will reduce the number of rows available to 
join. How much does that filtering reduce the keys?

The "selectivity" of a filter is the probability that any one row will pass 
through the filter:

{noformat}
|T'| = |T| * sel(f)

sel(f) = |T'| / |T|
{noformat}

Where:

* \{{|T|}} is the cardinality of some table or relation T.
* \{{|T'|}} is the cardinality of the new relation T' after filtering.
* \{{sel(f)}} is the selectivity of some filter f.

The current model makes the standard assumption that the NDV of key columns 
reduces by the same amount as the table cardinality. That is:

{noformat}
|T.k'| = |T.k| * sel(f)
{noformat}

This assumption is incorrect as explained below. The correct expression is 
explained in Swami & Schiefer (S&S), [On the Estimation of Join Result 
Sizes|https://pdfs.semanticscholar.org/2735/672262940a23cfce8d4cc1e25c0191de7da7.pdf]
 (S&S), section 5.


h4. Motivation

It is easiest to reason about these issues in the M:1 (FK/PK) case, but the 
logic also applies to both sides of the M:N (many-to-many, generic) case.

Consider an example. We have a M:1 relationship with two tables: D (for detail) 
and M (for master). A M:1 relationship exists:

{noformat}
D.fk --> M.pk
{noformat}

The foreign key (fk) in the detail table points to the primary key (pk) in the 
master table. Note that keys in the master table are unique, so the simple 
filtering expression works fine:

{noformat}
|M.pk'| = |M.pk| * sel(f)
{noformat}

On the detail side, there are multiple rows that have the same key. What we 
want to know is, when we apply the filtering above, how many of the foreign 
keys survive? Since there are multiple keys, the answer is not obvious. To see, 
this, let's assume that the detail table has 1000 rows, and uses only one FK 
value. If we select 500 rows, how many foreign keys are left? Obviously, the 
answer is that there is still one FK value.


h4. Correlated Filtering

There is one case where simple filtering is the right answer: if the filter is 
on the key column itself. Suppose we know that the left (detail) scan applied a 
filter on the foreign key column. Maybe \{{D.fk = 123}}. In this case (as 
explained in the S&S paper above) we know that \{{|D.fk'| = 1}}. In the general 
case, we may have any operator (!=, <, etc.) in the filter so the cardinality 
of the foreign key is simply the result of applying that filter:

{noformat}
|D.fk'| = |D.fk| * sel(f)
{noformat}


h4. Uncorrelated Filtering

In the next case, we filter on columns other than the primary key column. For 
simplicity, let us adopt the uniformity assumption from S&S paper: that 
applying a filter on a column other than the key results in a random sampling 
of the key column. As explained in S&S, section 5, we must use the "simple urn 
model":

{noformat}
|T.k'| = urn( |T.k|, |T'| )

url(c, n) = c * (1 - (1 - 1/c)^n)
{noformat}

Where:

* \{{urn(c, n)}} is the urn model expression which gives the estimated 
cardinality of a column as the result of a scan
* \{{c}} is the cardinality of a key column
* \{{n}} is the cardinality of the output relation of a scan

See the paper for the reasoning behind this expression.


h4. Combined Filtering

To make the above work, we observe that, in relational theory, we get the same 
result whether we apply all filters in one go, or apply them one-by-one. We get 
the same result if we apply the filters during the scan (as Impala does), or by 
first scanning all rows, then applying the filters afterward (as some other 
engines do.)

This observation allows us to break the calculation into two parts:

* Determine the key NDV and table cardinality produced by just the correlated 
filters.
* Use the urn model to predict key cardinality from the uncorrelated filters.

We start by sorting scan filters into two categories:

* Correlated filters applied to a given key column (here, \{{D.fk}}).
* All other filters (the uncorrelated filters.)

We do this by first gathering all predicates applied to the detail (left) 
relation. The detail for doing so is explained in IMPALA-8217 which describes 
the need to adjust for filters applied to both sides of a join. Given that set, 
we can do the following:

* Remove from the set all those predicates which reference \{{D.fk}}. These are 
the correlated predicates.
* Remove correlated filtering from the scan cardinality to get the cardinality 
due just to the uncorrelated predicates.
* Apply correlated filtering to the key column NDV to get the reduced NDV after 
filtering.
* Apply the urn model to compute key cardinality after uncorrelated filtering.

Suppose we have the following:

* \{{F(D)}}: the set of all filters applied to the scan of D: \{{(f1(D), f2(D), 
... fn(D))}}.
* \{{F(D.fk)}}: the subset of the above that apply to only the key column 
\{{D.fk}}.

Then we can compute or define the selectivity of these filters:

{noformat}
sel(scan(D)) = ∏ sel(Fi(D))

= |D'| / |D|

sel(F(D.fk)) = ∏ sel(Fi(D.fk))
{noformat}

Next we can work out the cardinalities required for inputs to the urn model:

{noformat}
|D.fk''| = |D.fk| * sel(F(D.fk))

|D''| = |D| * sel(F(D.fk))

= |D'| / sel(scan(D) * sel(F(D.fk))
{noformat}

Where:

* \{{|D.fk''|}} is the key cardinality obtained by applying just the correlated 
filters.
* \{{|D''|}} is the scan cardinality that would result if we applied only the 
correlated filters.

Then, we can apply the uncorrelated filtering using the urn model:

{noformat}
|D.fk'| = urn( |D.fk''|, |D''| )
{noformat}


h4. Compound Keys

The discussion above is for simple keys. If a key is compound (see IMPALA-8014) 
we can refine the above as follows:

* Compute each column in the key as described above.
* Multiply the adjusted NDV's to get the cardinality of the key as a whole, 
adjusting as explained in IMPALA-8014.


h4. Complication: Compound Joins

The calculation above is sound, but is complicated by several factors in 
Impala's implementation. First is that the left side of a join is usually not a 
table; it is usually another join. (This is a result of the left-deep pattern 
adopted by most query optimizers.) Because of this, we actually do not know the 
original table cardinality which we've called \{{|D|}}. Yes, we do know the 
size of the tables that went into a join, and we know the size of the relation 
that comes out of the join, but there is no reasonable number for the base 
table since there are multiple.

Instead, the above calculations have been based on what we do know:

* \{{|T'|}} the result of all filtering (and joins). This is the cardinality 
input to the present join.
* \{{F}}, the set of all filters applied anywhere in the subtree below a join 
input.
* \{{|D.fk|}}, the key column NDV as reported from HMS stats.

Using just these values we can calculate key cardinality as:

{noformat}
|D.fk''| = |D.fk| * sel(F(D.fk))

|D''| = |D'| / sel(scan(D) * sel(F(D.fk))

|D.fk'| = urn( |D.fk''|, |D''| )
{noformat}


h4. Tracking Adjusted NDV

The above recomputes adjusted NDV at each join from first principals. It is 
possible to simplify the task by working only one level at a time. Each scan or 
join would maintain a running adjusted NDV for each column:

* The scan node starts with the input NDVs as reported by HMS.
* The scan node computes output NDVs by applying filters to each column and 
tracking the result.
* The join node uses the scan output NDVs as |T.k''| (the NDV after correlated 
filtering.)
* The join node tracks its own output NDVs that result from applying 
uncorrelated filtering.

With this approach, changes bubble up the operator tree one step at a time. 
Debugging is easier since we can visualize the adjusted NDVs.

In the approach described in above sections, we track the combined set of all 
filters applied up the tree. With the approach described in this section, we 
track the result of applying the filters rather than the filters themselves.


h4. Complication: Non-Linear Filter Combination

The approach above assumes we can apply filters in any order which is a 
foundational assumption of relational theory. Unfortunately, Impala takes a 
different approach: it uses an exponential back-off:

{noformat}
|T'| = |T| * ∏(i =0..) Fi^i
{noformat}

This means that the filters cannot be applied in any order, nor can we easily 
back one filter out of the combined filter.

Impala does this to compensate, in part, for the fact that Impala does not 
compute selectivity for any but simple \{{col = const}} predicate.

To allow the calculations above, we must remove the exponential back-off, which 
requires computing selectivity for all predicates -- something that is a good 
idea anyway. See IMPALA-8217.


h4. Generalizing to the M:N (Generic) Case

The discussion above discusses foreign keys in a M:1 join. The same logic 
applies to both sides of a join in a M:N (many-to-many, "generic") join. 
Calculations for the right-side table are a bit simpler because, in Impala, the 
right input is always a base table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to