[
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813245#comment-17813245
]
Ulrich Kramer commented on CALCITE-6236:
----------------------------------------
Falling back to {{Join.estimateRowCount}} or {{Correlate.estimateRowCount}}
will not fix the issue.
As I tried to explain in the description of this issue, the problem comes from
the additional {{Filter}} relation that the
{{EnumerableBatchNestedLoopJoinRule}} creates. This relation does not exist
unless the rule is applied.
It's not taken into account that this filter reduces the {{estimatedRowCount}}
by a factor of 4.
Therefore, {{Correlate.estimateRowCount}} will also return a value which is far
too low.
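
A minimal sketch of the effect, assuming a simplified model in which the join
row count is left rows x right rows x join selectivity. This is not the actual
implementation of {{RelMdUtil.getJoinRowCount}}; the class name, the method and
the join selectivity of 0.125 are hypothetical and only illustrate how a filter
with selectivity 0.25 on the right input lowers the estimate by a factor of 4.

```java
final class FilterSelectivitySketch {
    // Simplified join estimate: leftRows * rightRows * joinSelectivity.
    // Illustrative model only, not Calcite's RelMdUtil.getJoinRowCount.
    static double joinRowCount(double leftRows, double rightRows, double joinSelectivity) {
        return leftRows * rightRows * joinSelectivity;
    }

    public static void main(String[] args) {
        double leftRows = 100.0;
        double rightRows = 100.0;
        double joinSelectivity = 0.125;   // hypothetical value
        double filterSelectivity = 0.25;  // the "factor of 4" from this issue

        // Estimate for a join over the unfiltered right input.
        double plain = joinRowCount(leftRows, rightRows, joinSelectivity);
        // Estimate when the right input is first reduced by the added filter.
        double withFilter =
            joinRowCount(leftRows, rightRows * filterSelectivity, joinSelectivity);

        System.out.println("without filter: " + plain);
        System.out.println("with filter:    " + withFilter);
        System.out.println("ratio:          " + (plain / withFilter));
    }
}
```

In this model any estimate derived from the filtered right input inherits the
reduced row count, regardless of which {{estimateRowCount}} method is used.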
> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> -----------------------------------------------------------------------
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
> Issue Type: Bug
> Reporter: Ulrich Kramer
> Priority: Major
> Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right
> relation.
> This filter reduces the number of rows by its selectivity (in our case by a
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower
> compared to the one returned for a {{JdbcJoin}}.
> As a result, in most cases {{EnumerableBatchNestedLoopJoin}} is preferred
> over {{JdbcJoin}}.
> Here is an example of the different costs:
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 cumulative_costs=1005.0
>     JdbcToEnumerableConverter rows=100.0 self_costs=10.0 cumulative_costs=190.0
>       JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>         JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
>     JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>       JdbcFilter rows=25.0 self_costs=25.0 cumulative_costs=125.0
>         JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
> vs.
> {code}
> JdbcToEnumerableConverter rows=1585.0 self_costs=158.5 cumulative_costs=2023.5
>   JdbcJoin rows=1585.0 self_costs=1585.0 cumulative_costs=1865.0
>     JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>       JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
>     JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
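
The cumulative costs in the two plan dumps above can be re-derived by summing
{{self_costs}} over each plan tree. The sketch below does that in plain Java,
without any Calcite classes. The {{Node}} type and helper names are hypothetical,
and the parent/child nesting is an assumption read off the dumps (it is the
nesting under which the listed {{cumulative_costs}} values add up).

```java
import java.util.List;

final class PlanCostSketch {
    // A plan node: its own cost plus its inputs. Hypothetical type, not a Calcite class.
    static final class Node {
        final String name;
        final double selfCost;
        final List<Node> inputs;

        Node(String name, double selfCost, Node... inputs) {
            this.name = name;
            this.selfCost = selfCost;
            this.inputs = List.of(inputs);
        }

        // Cumulative cost = own cost + cumulative cost of every input.
        double cumulativeCost() {
            double total = selfCost;
            for (Node input : inputs) {
                total += input.cumulativeCost();
            }
            return total;
        }
    }

    // First dump: the plan produced by EnumerableBatchNestedLoopJoinRule.
    static Node batchNestedLoopPlan() {
        Node left = new Node("JdbcToEnumerableConverter", 10.0,
            new Node("JdbcProject", 80.0, new Node("JdbcTableScan", 100.0)));
        Node right = new Node("JdbcToEnumerableConverter", 2.5,
            new Node("JdbcFilter", 25.0, new Node("JdbcTableScan", 100.0)));
        return new Node("EnumerableProject", 460.0,
            new Node("EnumerableBatchNestedLoopJoin", 687.5, left, right));
    }

    // Second dump: the plan that pushes the join down as a JdbcJoin.
    static Node jdbcJoinPlan() {
        Node left = new Node("JdbcProject", 80.0, new Node("JdbcTableScan", 100.0));
        Node right = new Node("JdbcTableScan", 100.0);
        return new Node("JdbcToEnumerableConverter", 158.5,
            new Node("JdbcJoin", 1585.0, left, right));
    }

    public static void main(String[] args) {
        System.out.println(batchNestedLoopPlan().cumulativeCost()); // prints 1465.0
        System.out.println(jdbcJoinPlan().cumulativeCost());        // prints 2023.5
    }
}
```

Most of the gap between the two totals comes from the join and converter nodes,
whose {{self_costs}} track the estimated row counts (460.0 vs. 1585.0).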