[
https://issues.apache.org/jira/browse/CALCITE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated CALCITE-3571:
------------------------------------
Labels: pull-request-available (was: )
> RelBuilder#shouldMergeProject throws an exception for JOIN with complex
> conditions
> ----------------------------------------------------------------------------------
>
> Key: CALCITE-3571
> URL: https://issues.apache.org/jira/browse/CALCITE-3571
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.21.0
> Reporter: Kirill Kozlov
> Priority: Major
> Labels: pull-request-available
>
> Before joins are created left and right paths are pared first. For the 1st
> query above they are as follows:
> {code:java}
> Left:
> LogicalProject with RecordType(VARBINARY id, VARCHAR fA1, VARCHAR fA1_2)
> LogicalTableScan with RecordType(VARBINARY id, VARCHAR fA1)
> Right:
> LogicalTableScan with RecordType(VARCHAR id, VARCHAR fB1){code}
> As they are processed - they are registered as leaves (added to the Array).
>
> When Join node is being created it knows what the `condition expressions` is:
> {code:java}
> =(TO_HEX($0), $3)
> {code}
> Since TO_HEX is not computed anywhere - it modifies the left input to be as
> follows (via RelOptUtil#pushDownJoinConditions) because
> RelBuilder#shouldMergeProject always return true.
>
> {code:java}
> LogicalProject with RecordType(VARBINARY id, VARCHAR fA1, VARCHAR fA1_2,
> VARCHAR $f3)
> {code}
> where `VARCHAR $f3` is a result of TO_HEX. Note that the list of leaves is
> not updated.
> [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L2571]
>
> Finally, when identifier "query.fA1_2" is being converted (via
> SqlToRelConverter#convertIdentifier) for the top-most node
> {code:java}
> top-most node:
> LogicalProject with RecordType(VARBINARY id, VARCHAR fA1, VARCHAR fA1_2,
> VARCHAR id0, VARCHAR fB1)
> LogicalJoin with RecordType(VARBINARY id, VARCHAR fA1, VARCHAR fA1_2,
> VARCHAR $f3, VARCHAR id0, VARCHAR fB1)
> LogicalProject with RecordType(VARBINARY id, VARCHAR fA1, VARCHAR fA1_2,
> VARCHAR $f3)
> LogicalTableScan with RecordType(VARBINARY id, VARCHAR fA1)
> LogicalTableScan with RecordType(VARCHAR id, VARCHAR fB1){code}
> Blackboard perform a lookup (via SqlToRelConverter#lookupExp), in process of
> which LookupContext is created.
> In a constructor, LookupContext performs flatten, which recursively traverses
> tree of nodes (from above codeblock) and checks the leaves to see if they
> contain such expression. When it does get to the modified left input of a
> join it does not get a match on it and continues further down to a TableScan.
> When it finally flattens the result, TableScan's RecordType knows nothing
> about a duplicated field `fA1_2`, causing an error above.
>
> I think a viable solution would be to modify Join creation to register a
> resulting join inputs as leaves (when they are modified). Alternative
> approach would be to not merge Projects when join needs to modify an input.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)