[ 
https://issues.apache.org/jira/browse/FLINK-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16187397#comment-16187397
 ] 

Xingcan Cui edited comment on FLINK-7730 at 10/1/17 2:24 PM:
-------------------------------------------------------------

Hi [~fhueske], I've checked the implementation and got some clues in the SQL 
level. For instance, given a simple table {{WordCount(word:String, 
frequency:Int)}}, a table function {{split: word:String => (letter:String, 
length:String)}}, and a SQL like that
{code:sql}
SELECT word, letter, length
FROM WordCount
LEFT JOIN LATERAL TABLE(split(word)) AS T (letter, length) ON frequency = 
length OR length < 5
{code}
, the query will be translated to the logical plan below.
{code}
LogicalProject(word=[$0], name=[$2], length=[$3])
  LogicalFilter(condition=[OR(=($1, CAST($3):BIGINT), <($3, 5))])
    LogicalCorrelate(correlation=[$cor0], joinType=[left], 
requiredColumns=[{0}])
      LogicalTableScan(table=[[WordCount]])
      LogicalTableFunctionScan(invocation=[split($cor0.word)], 
rowType=[RecordType(VARCHAR(65536) _1, INTEGER _2)], elementType=[class 
[Ljava.lang.Object;])
{code}
Apparently, this logical plan will lead to an improper physical plan, which 
first correlates each row with its table function results in the 
{{CorrelateFlatMapRunner}} (just like performing a cartesian product) and then 
filters the rows in a {{FlatMapFunction}}. It only works for inner join, but 
not for left outer join.

Under this circumstance, could you give some suggestions on fixing this? BTW, I 
find it's hard to debug the dynamically generated functions (the IDE can not 
locate the source), is there some special techniques that can be used to deal 
with them?


was (Author: xccui):
Hi [~fhueske], I've checked the implementation and got some clues in the SQL 
level. For instance, given a simple table {{WordCount(word:String, 
frequency:Int)}}, a table function {{split: word:String => (letter:String, 
length:String)}}, and a SQL like that
{code:sql}
SELECT word, name, length
FROM WordCount
LEFT JOIN LATERAL TABLE(split(word)) AS T (letter, length) ON frequency = 
length OR length < 5
{code}
, the query will be translated to the logical plan below.
{code}
LogicalProject(word=[$0], name=[$2], length=[$3])
  LogicalFilter(condition=[OR(=($1, CAST($3):BIGINT), <($3, 5))])
    LogicalCorrelate(correlation=[$cor0], joinType=[left], 
requiredColumns=[{0}])
      LogicalTableScan(table=[[WordCount]])
      LogicalTableFunctionScan(invocation=[split($cor0.word)], 
rowType=[RecordType(VARCHAR(65536) _1, INTEGER _2)], elementType=[class 
[Ljava.lang.Object;])
{code}
Apparently, this logical plan will lead to an improper physical plan, which 
first correlates each row with its table function results in the 
{{CorrelateFlatMapRunner}} (just like performing a cartesian product) and then 
filters the rows in a {{FlatMapFunction}}. It only works for inner join, but 
not for left outer join.

Under this circumstance, could you give some suggestions on fixing this? BTW, I 
find it's hard to debug the dynamically generated functions (the IDE can not 
locate the source), is there some special techniques that can be used to deal 
with them?

> TableFunction LEFT OUTER joins with ON predicates are broken
> ------------------------------------------------------------
>
>                 Key: FLINK-7730
>                 URL: https://issues.apache.org/jira/browse/FLINK-7730
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>    Affects Versions: 1.4.0, 1.3.2
>            Reporter: Fabian Hueske
>            Assignee: Xingcan Cui
>
> TableFunction left outer joins with predicates in the ON clause are broken. 
> Apparently, the are no tests for this and it has never worked. I observed 
> issues on several layers:
> - Table Function does not correctly validate equality predicate: 
> {{leftOuterJoin(func1('c) as 'd,  'a.cast(Types.STRING) === 'd)}} is rejected 
> because the predicate is not considered as an equality predicate (the cast 
> needs to be pushed down).
> - Plans cannot be correctly translated: {{leftOuterJoin(func1('c) as 'd,  'c 
> === 'd)}} gives an optimizer exception.
> - SQL queries get translated but produce incorrect results. For example 
> {{SELECT a, b, c, d FROM MyTable LEFT OUTER JOIN LATERAL TABLE(tfunc(c)) AS 
> T(d) ON d = c}} returns an empty result if the condition {{d = c}} never 
> returns true. However, the outer side should be preserved and padded with 
> nulls.
> So there seem to be many issues with table function outer joins. Especially, 
> the wrong result produced by SQL queries need to be quickly fixed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to