[ 
https://issues.apache.org/jira/browse/CALCITE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-3571:
------------------------------------
    Labels: pull-request-available  (was: )

> RelBuilder#shouldMergeProject throws an exception for JOIN with complex 
> conditions
> ----------------------------------------------------------------------------------
>
>                 Key: CALCITE-3571
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3571
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Kirill Kozlov
>            Priority: Major
>              Labels: pull-request-available
>
> Before joins are created left and right paths are pared first. For the 1st 
> query above they are as follows:
> {code:java}
> Left:
> LogicalProject with RecordType(VARBINARY id, VARCHAR fA1, VARCHAR fA1_2)
>   LogicalTableScan with RecordType(VARBINARY id, VARCHAR fA1)
> Right:
> LogicalTableScan with RecordType(VARCHAR id, VARCHAR fB1){code}
> As they are processed - they are registered as leaves (added to the Array).
>  
> When Join node is being created it knows what the `condition expressions` is:
> {code:java}
> =(TO_HEX($0), $3)
> {code}
> Since TO_HEX is not computed anywhere - it modifies the left input to be as 
> follows (via RelOptUtil#pushDownJoinConditions) because 
> RelBuilder#shouldMergeProject always return true.
>  
> {code:java}
> LogicalProject with RecordType(VARBINARY id, VARCHAR fA1, VARCHAR fA1_2, 
> VARCHAR $f3)
> {code}
> where `VARCHAR $f3` is a result of TO_HEX. Note that the list of leaves is 
> not updated.
> [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L2571]
>  
> Finally, when identifier "query.fA1_2" is being converted (via 
> SqlToRelConverter#convertIdentifier) for the top-most node
> {code:java}
> top-most node:
> LogicalProject with RecordType(VARBINARY id, VARCHAR fA1, VARCHAR fA1_2, 
> VARCHAR id0, VARCHAR fB1)
>   LogicalJoin with RecordType(VARBINARY id, VARCHAR fA1, VARCHAR fA1_2, 
> VARCHAR $f3, VARCHAR id0, VARCHAR fB1)
>     LogicalProject with RecordType(VARBINARY id, VARCHAR fA1, VARCHAR fA1_2, 
> VARCHAR $f3)
>       LogicalTableScan with RecordType(VARBINARY id, VARCHAR fA1)
>     LogicalTableScan with RecordType(VARCHAR id, VARCHAR fB1){code}
> Blackboard perform a lookup (via SqlToRelConverter#lookupExp), in process of 
> which LookupContext is created.
> In a constructor, LookupContext performs flatten, which recursively traverses 
>  tree of nodes (from above codeblock) and checks the leaves to see if they 
> contain such expression. When it does get to the modified left input of a 
> join it does not get a match on it and continues further down to a TableScan.
> When it finally flattens the result, TableScan's RecordType knows nothing 
> about a duplicated field `fA1_2`, causing an error above.
>  
> I think a viable solution would be to modify Join creation to register a 
> resulting join inputs as leaves (when they are modified). Alternative 
> approach would be to not merge Projects when join needs to modify an input.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to