[jira] [Updated] (IMPALA-8045) Rollup of Smaller Join Cardinality Issues

Paul Rogers (JIRA) Thu, 03 Jan 2019 19:09:20 -0800


     [ 
https://issues.apache.org/jira/browse/IMPALA-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Paul Rogers updated IMPALA-8045:
--------------------------------
    Summary: Rollup of Smaller Join Cardinality Issues  (was: ScanNode 
confusion between table and scan input cardinality)

> Rollup of Smaller Join Cardinality Issues
> -----------------------------------------
>
>                 Key: IMPALA-8045
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8045
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.1.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>
> The {{ScanNode}} class in the planner contains an {{inputCardinality_}} field 
> that join calculations use as a proxy for the table size. However, the actual 
> scan node implementations set {{inputCardinality_}} to the estimated 
> number of rows *read* by the scan, which is useful for understanding the 
> physical scan structure. For joins, however, we need the base table 
> cardinality.
>
> For example, the join may use the input cardinality to estimate the 
> reduction in rows due to filters in order to adjust the NDV of key columns. 
> But since the input cardinality is the scan row count, not the table row 
> count, the math does not work out.
>
> The solution is to clarify the code so that it separates the scan row count 
> from the base table row count.
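
To illustrate why the baseline matters, here is a minimal sketch with made-up numbers. It is not Impala's planner code: the class name, the method names, and the assumption that NDV scales linearly with filter selectivity are all hypothetical and chosen only to show the arithmetic.

```java
// Hypothetical sketch: class, method, and variable names are illustrative
// and are NOT Impala's actual planner API.
final class ScanCardinalitySketch {

    // Fraction of rows that survive the filters, relative to a chosen baseline.
    static double selectivity(long outputRows, long baselineRows) {
        if (baselineRows <= 0) {
            throw new IllegalArgumentException("baselineRows must be positive");
        }
        return Math.min(1.0, (double) outputRows / baselineRows);
    }

    // Assumption: NDV shrinks linearly with selectivity, with a floor of 1.
    static long adjustNdv(long ndv, double selectivity) {
        return Math.max(1L, Math.round(ndv * selectivity));
    }

    public static void main(String[] args) {
        long tableRowCount = 1_000_000L; // rows in the base table (made-up)
        long scanRowCount = 100_000L;    // rows the scan expects to read (made-up)
        long outputRowCount = 10_000L;   // rows left after filters (made-up)
        long keyNdv = 1_000L;            // key NDV over the whole table (made-up)

        // Baseline = base table row count, matching table-level NDV stats.
        long ndvFromTable = adjustNdv(keyNdv, selectivity(outputRowCount, tableRowCount));
        // Baseline = scan row count, i.e. the confusion described above.
        long ndvFromScan = adjustNdv(keyNdv, selectivity(outputRowCount, scanRowCount));

        System.out.println("NDV using base table row count: " + ndvFromTable);
        System.out.println("NDV using scan row count:       " + ndvFromScan);
    }
}
```

With these numbers, dividing by the table row count gives an adjusted NDV of 10, while dividing by the scan row count gives 100, so the same filter appears ten times less selective than it is relative to the table-level statistics.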



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

