[
https://issues.apache.org/jira/browse/IMPALA-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Rogers reassigned IMPALA-8218:
-----------------------------------
Assignee: (was: Paul Rogers)
> Use "simple bin model" to estimate M:N, FK join cardinality
> -----------------------------------------------------------
>
> Key: IMPALA-8218
> URL: https://issues.apache.org/jira/browse/IMPALA-8218
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 3.1.0
> Reporter: Paul Rogers
> Priority: Minor
>
> When computing join cardinality, the planner must determine how much a
> filters reduce the cardinality of join keys. For example, in a M:1 (FK/PK)
> join, filtering on the left (M, FK) side will reduce the number of rows
> available to join. How much does that filtering reduce the keys?
> The "selectivity" of a filter is the probability that any one row will pass
> through the filter:
> {noformat}
> |T'| = |T| * sel(f)
> sel(f) = |T'| / |T|
> {noformat}
> Where:
> * \{{|T|}} is the cardinality of some table or relation T.
> * \{{|T'|}} is the cardinality of the new relation T' after filtering.
> * \{{sel(f)}} is the selectivity of some filter f.
> The current model makes the standard assumption that the NDV of key columns
> reduces by the same amount as the table cardinality. That is:
> {noformat}
> |T.k'| = |T.k| * sel(f)
> {noformat}
> This assumption is incorrect as explained below. The correct expression is
> explained in Swami & Schiefer (S&S), [On the Estimation of Join Result
> Sizes|https://pdfs.semanticscholar.org/2735/672262940a23cfce8d4cc1e25c0191de7da7.pdf]
> (S&S), section 5.
> h4. Motivation
> It is easiest to reason about these issues in the M:1 (FK/PK) case, but the
> logic also applies to both sides of the M:N (many-to-many, generic) case.
> Consider an example. We have a M:1 relationship with two tables: D (for
> detail) and M (for master). A M:1 relationship exists:
> {noformat}
> D.fk --> M.pk
> {noformat}
> The foreign key (fk) in the detail table points to the primary key (pk) in
> the master table. Note that keys in the master table are unique, so the
> simple filtering expression works fine:
> {noformat}
> |M.pk'| = |M.pk| * sel(f)
> {noformat}
> On the detail side, there are multiple rows that have the same key. What we
> want to know is, when we apply the filtering above, how many of the foreign
> keys survive? Since there are multiple keys, the answer is not obvious. To
> see, this, let's assume that the detail table has 1000 rows, and uses only
> one FK value. If we select 500 rows, how many foreign keys are left?
> Obviously, the answer is that there is still one FK value.
> h4. Correlated Filtering
> There is one case where simple filtering is the right answer: if the filter
> is on the key column itself. Suppose we know that the left (detail) scan
> applied a filter on the foreign key column. Maybe \{{D.fk = 123}}. In this
> case (as explained in the S&S paper above) we know that \{{|D.fk'| = 1}}. In
> the general case, we may have any operator (!=, <, etc.) in the filter so the
> cardinality of the foreign key is simply the result of applying that filter:
> {noformat}
> |D.fk'| = |D.fk| * sel(f)
> {noformat}
> h4. Uncorrelated Filtering
> In the next case, we filter on columns other than the primary key column. For
> simplicity, let us adopt the uniformity assumption from S&S paper: that
> applying a filter on a column other than the key results in a random sampling
> of the key column. As explained in S&S, section 5, we must use the "simple
> urn model":
> {noformat}
> |T.k'| = urn( |T.k|, |T'| )
> url(c, n) = c * (1 - (1 - 1/c)^n)
> {noformat}
> Where:
> * \{{urn(c, n)}} is the urn model expression which gives the estimated
> cardinality of a column as the result of a scan
> * \{{c}} is the cardinality of a key column
> * \{{n}} is the cardinality of the output relation of a scan
> See the paper for the reasoning behind this expression.
> h4. Combined Filtering
> To make the above work, we observe that, in relational theory, we get the
> same result whether we apply all filters in one go, or apply them one-by-one.
> We get the same result if we apply the filters during the scan (as Impala
> does), or by first scanning all rows, then applying the filters afterward (as
> some other engines do.)
> This observation allows us to break the calculation into two parts:
> * Determine the key NDV and table cardinality produced by just the correlated
> filters.
> * Use the urn model to predict key cardinality from the uncorrelated filters.
> We start by sorting scan filters into two categories:
> * Correlated filters applied to a given key column (here, \{{D.fk}}).
> * All other filters (the uncorrelated filters.)
> We do this by first gathering all predicates applied to the detail (left)
> relation. The detail for doing so is explained in IMPALA-8217 which describes
> the need to adjust for filters applied to both sides of a join. Given that
> set, we can do the following:
> * Remove from the set all those predicates which reference \{{D.fk}}. These
> are the correlated predicates.
> * Remove correlated filtering from the scan cardinality to get the
> cardinality due just to the uncorrelated predicates.
> * Apply correlated filtering to the key column NDV to get the reduced NDV
> after filtering.
> * Apply the urn model to compute key cardinality after uncorrelated filtering.
> Suppose we have the following:
> * \{{F(D)}}: the set of all filters applied to the scan of D: \{{(f1(D),
> f2(D), ... fn(D))}}.
> * \{{F(D.fk)}}: the subset of the above that apply to only the key column
> \{{D.fk}}.
> Then we can compute or define the selectivity of these filters:
> {noformat}
> sel(scan(D)) = ∏ sel(Fi(D))
> = |D'| / |D|
> sel(F(D.fk)) = ∏ sel(Fi(D.fk))
> {noformat}
> Next we can work out the cardinalities required for inputs to the urn model:
> {noformat}
> |D.fk''| = |D.fk| * sel(F(D.fk))
> |D''| = |D| * sel(F(D.fk))
> = |D'| / sel(scan(D) * sel(F(D.fk))
> {noformat}
> Where:
> * \{{|D.fk''|}} is the key cardinality obtained by applying just the
> correlated filters.
> * \{{|D''|}} is the scan cardinality that would result if we applied only the
> correlated filters.
> Then, we can apply the uncorrelated filtering using the urn model:
> {noformat}
> |D.fk'| = urn( |D.fk''|, |D''| )
> {noformat}
> h4. Compound Keys
> The discussion above is for simple keys. If a key is compound (see
> IMPALA-8014) we can refine the above as follows:
> * Compute each column in the key as described above.
> * Multiply the adjusted NDV's to get the cardinality of the key as a whole,
> adjusting as explained in IMPALA-8014.
> h4. Complication: Compound Joins
> The calculation above is sound, but is complicated by several factors in
> Impala's implementation. First is that the left side of a join is usually not
> a table; it is usually another join. (This is a result of the left-deep
> pattern adopted by most query optimizers.) Because of this, we actually do
> not know the original table cardinality which we've called \{{|D|}}. Yes, we
> do know the size of the tables that went into a join, and we know the size of
> the relation that comes out of the join, but there is no reasonable number
> for the base table since there are multiple.
> Instead, the above calculations have been based on what we do know:
> * \{{|T'|}} the result of all filtering (and joins). This is the cardinality
> input to the present join.
> * \{{F}}, the set of all filters applied anywhere in the subtree below a join
> input.
> * \{{|D.fk|}}, the key column NDV as reported from HMS stats.
> Using just these values we can calculate key cardinality as:
> {noformat}
> |D.fk''| = |D.fk| * sel(F(D.fk))
> |D''| = |D'| / sel(scan(D) * sel(F(D.fk))
> |D.fk'| = urn( |D.fk''|, |D''| )
> {noformat}
> h4. Tracking Adjusted NDV
> The above recomputes adjusted NDV at each join from first principals. It is
> possible to simplify the task by working only one level at a time. Each scan
> or join would maintain a running adjusted NDV for each column:
> * The scan node starts with the input NDVs as reported by HMS.
> * The scan node computes output NDVs by applying filters to each column and
> tracking the result.
> * The join node uses the scan output NDVs as |T.k''| (the NDV after
> correlated filtering.)
> * The join node tracks its own output NDVs that result from applying
> uncorrelated filtering.
> With this approach, changes bubble up the operator tree one step at a time.
> Debugging is easier since we can visualize the adjusted NDVs.
> In the approach described in above sections, we track the combined set of all
> filters applied up the tree. With the approach described in this section, we
> track the result of applying the filters rather than the filters themselves.
> h4. Complication: Non-Linear Filter Combination
> The approach above assumes we can apply filters in any order which is a
> foundational assumption of relational theory. Unfortunately, Impala takes a
> different approach: it uses an exponential back-off:
> {noformat}
> |T'| = |T| * ∏(i =0..) Fi^i
> {noformat}
> This means that the filters cannot be applied in any order, nor can we easily
> back one filter out of the combined filter.
> Impala does this to compensate, in part, for the fact that Impala does not
> compute selectivity for any but simple \{{col = const}} predicate.
> To allow the calculations above, we must remove the exponential back-off,
> which requires computing selectivity for all predicates -- something that is
> a good idea anyway. See IMPALA-8217.
> h4. Generalizing to the M:N (Generic) Case
> The discussion above discusses foreign keys in a M:1 join. The same logic
> applies to both sides of a join in a M:N (many-to-many, "generic") join.
> Calculations for the right-side table are a bit simpler because, in Impala,
> the right input is always a base table.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]