[ 
https://issues.apache.org/jira/browse/CALCITE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189054#comment-17189054
 ] 

Ruben Q L commented on CALCITE-4208:
------------------------------------

I have this new proposal:
{code:java}
selectivity = mq.getSelectivity(join, condition);
// INNER: unchanged from the current formula
innerRowCount = left * right * selectivity;
// LEFT / RIGHT: add left (or right) * (1D - selectivity) to the inner estimate
leftRowCount = left * (1D - selectivity) + innerRowCount;
rightRowCount = right * (1D - selectivity) + innerRowCount;
// FULL: left * (1D - selectivity) + right * (1D - selectivity) + innerRowCount,
// factored
fullRowCount = (left + right) * (1D - selectivity) + innerRowCount;
{code}
which seems more accurate than the current computation, though it is nothing 
revolutionary. It causes few regressions in our tests: only a couple of them 
would need to be adjusted.
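
As a rough, self-contained sketch of the proposal (not the actual Calcite 
code), the formulas could be wrapped in a small class. The names 
JoinRowCountSketch, JoinType and estimate are hypothetical and only for 
illustration, and the selectivity is passed in directly instead of coming 
from mq.getSelectivity:

```java
public class JoinRowCountSketch {

  /** Hypothetical local stand-in for the join types discussed in this issue. */
  public enum JoinType { INNER, LEFT, RIGHT, FULL }

  /**
   * Estimates the join row count with the proposed formulas.
   *
   * @param left        estimated row count of the left input
   * @param right       estimated row count of the right input
   * @param selectivity assumed selectivity of the join condition, in [0, 1]
   */
  public static double estimate(JoinType type, double left, double right,
      double selectivity) {
    // Same as the current formula, which is kept for INNER joins.
    double innerRowCount = left * right * selectivity;
    switch (type) {
      case LEFT:
        return left * (1D - selectivity) + innerRowCount;
      case RIGHT:
        return right * (1D - selectivity) + innerRowCount;
      case FULL:
        return (left + right) * (1D - selectivity) + innerRowCount;
      default:
        return innerRowCount;
    }
  }

  public static void main(String[] args) {
    // Illustrative inputs only; they are not taken from any Calcite test.
    double left = 1000D;
    double right = 10D;
    double selectivity = 0.01D;
    for (JoinType type : JoinType.values()) {
      System.out.println(type + " -> " + estimate(type, left, right, selectivity));
    }
  }
}
```

With these illustrative inputs the current formula gives 100 rows for every 
join type, although a LEFT join returns at least as many rows as its left 
input (1000 here); the proposed LEFT estimate is about 1090.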

> Improve metadata row count for Join
> -----------------------------------
>
>                 Key: CALCITE-4208
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4208
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Ruben Q L
>            Priority: Major
>
> Currently, the default metadata row count for join 
> {{RelMdRowCount#getRowCount(Join rel, RelMetadataQuery mq)}} relies on 
> {{RelMdUtil.getJoinRowCount}}. This method has several issues:
>  - In case of ANTI join, it returns the same estimation as a SEMI join
>  - In other cases (INNER, LEFT, RIGHT, FULL), it returns always the same 
> formula:
>  {{leftRowCount * rightRowCount * mq.getSelectivity(join, condition)}}
>  which seems valid for an INNER join, but not for LEFT / RIGHT / FULL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
