[ https://issues.apache.org/jira/browse/CALCITE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189054#comment-17189054 ]

Ruben Q L commented on CALCITE-4208:
------------------------------------
I have this new proposal:
{code:java}
selectivity = mq.getSelectivity(join, condition);
innerRowCount = left * right * selectivity; // unchanged
leftRowCount = left * (1D - selectivity) + innerRowCount;
rightRowCount = right * (1D - selectivity) + innerRowCount;
// i.e. left * (1D - selectivity) + right * (1D - selectivity) + innerRowCount
fullRowCount = (left + right) * (1D - selectivity) + innerRowCount;
{code}
which seems more accurate than the current computations, though it is nothing
revolutionary. It causes few regressions in our tests: only a couple of them
would need to be adjusted.

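As a rough illustration of the proposal, here is a minimal, self-contained sketch. The class {{JoinRowCountSketch}} and its method names are hypothetical (they are not part of Calcite), and the selectivity is passed in as a plain number instead of being obtained from {{mq.getSelectivity}}:

```java
/** Hypothetical sketch of the proposed estimates; this is not Calcite code. */
public final class JoinRowCountSketch {
  private JoinRowCountSketch() {}

  /** INNER: unchanged from the current formula. */
  public static double inner(double left, double right, double selectivity) {
    return left * right * selectivity;
  }

  /** LEFT: the inner estimate plus a (1 - selectivity) share of the left rows. */
  public static double leftOuter(double left, double right, double selectivity) {
    return left * (1D - selectivity) + inner(left, right, selectivity);
  }

  /** RIGHT: the inner estimate plus a (1 - selectivity) share of the right rows. */
  public static double rightOuter(double left, double right, double selectivity) {
    return right * (1D - selectivity) + inner(left, right, selectivity);
  }

  /** FULL: the inner estimate plus a (1 - selectivity) share of both inputs. */
  public static double fullOuter(double left, double right, double selectivity) {
    return (left + right) * (1D - selectivity) + inner(left, right, selectivity);
  }

  public static void main(String[] args) {
    // Assumed inputs, chosen so that every intermediate value is exact in binary.
    double left = 100D;
    double right = 40D;
    double selectivity = 0.25D;
    System.out.println(inner(left, right, selectivity));      // 1000.0
    System.out.println(leftOuter(left, right, selectivity));  // 1075.0
    System.out.println(rightOuter(left, right, selectivity)); // 1030.0
    System.out.println(fullOuter(left, right, selectivity));  // 1105.0
  }
}
```

With these formulas the LEFT, RIGHT and FULL estimates are never smaller than the INNER estimate for the same inputs, since each only adds a non-negative term to it (assuming a selectivity between 0 and 1).
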
> Improve metadata row count for Join
> -----------------------------------
>
> Key: CALCITE-4208
> URL: https://issues.apache.org/jira/browse/CALCITE-4208
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Reporter: Ruben Q L
> Priority: Major
>
> Currently, the default metadata row count for join
> {{RelMdRowCount#getRowCount(Join rel, RelMetadataQuery mq)}} relies on
> {{RelMdUtil.getJoinRowCount}}. This method has several issues:
> - For an ANTI join, it returns the same estimate as for a SEMI join.
> - In all other cases (INNER, LEFT, RIGHT, FULL), it always returns the same
> formula:
> {{leftRowCount * rightRowCount * mq.getSelectivity(join, condition)}}
> which seems valid for an INNER join, but not for LEFT / RIGHT / FULL.
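
To make the concern about LEFT / RIGHT / FULL concrete, here is a small sketch under stated assumptions. {{currentEstimate}} is a hypothetical stand-in for the single formula quoted above (it does not call Calcite), and the lower bounds rely only on the fact that an outer join keeps every row of its preserved side at least once:

```java
/** Hypothetical sketch comparing the single formula with outer-join lower bounds. */
public final class OuterJoinBoundSketch {
  private OuterJoinBoundSketch() {}

  /** Stand-in for leftRowCount * rightRowCount * selectivity. */
  public static double currentEstimate(double left, double right, double selectivity) {
    return left * right * selectivity;
  }

  /** Smallest result size the given join type can produce. */
  public static double lowerBound(String joinType, double left, double right) {
    switch (joinType) {
      case "LEFT":
        return left;                  // every left row appears at least once
      case "RIGHT":
        return right;                 // every right row appears at least once
      case "FULL":
        return Math.max(left, right); // both sides are preserved
      default:
        return 0D;                    // INNER may be empty
    }
  }

  public static void main(String[] args) {
    // Assumed inputs: a selective join condition over two small inputs.
    double left = 100D;
    double right = 40D;
    double selectivity = 0.001D;
    double estimate = currentEstimate(left, right, selectivity);
    for (String type : new String[] {"INNER", "LEFT", "RIGHT", "FULL"}) {
      boolean below = estimate < lowerBound(type, left, right);
      System.out.println(type + ": estimate below lower bound? " + below);
    }
    // INNER: estimate below lower bound? false
    // LEFT: estimate below lower bound? true
    // RIGHT: estimate below lower bound? true
    // FULL: estimate below lower bound? true
  }
}
```

The estimate here is roughly 4 rows, which is plausible for INNER but impossible for a LEFT join over 100 left rows.
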