[
https://issues.apache.org/jira/browse/FLINK-22113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315938#comment-17315938
]
wangzhihao commented on FLINK-22113:
------------------------------------
[~jark] Will it affect correctness, or just performance? More specifically,
for the internal state MapState<row, count> in a multi-join case such as A join
B join C, will the count always be 1, or will it gradually increase?
{quote}1) input doesn't have a unique key => MapState<row, count>,
where the map key is the input row and the map value is the number of equal
rows.{quote}
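
To make the quoted state layout concrete, here is a minimal plain-Java sketch of
a row-to-count map. It is not Flink code: the class and method names
(RowCountStateSketch, addRecord, retractRecord) are hypothetical, and a HashMap
stands in for MapState. It only illustrates that, under this layout, the count
for a row exceeds 1 when an identical row is added again before being retracted.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the MapState<row, count> layout; not Flink code. */
class RowCountStateSketch {
    // Map key: the input row; map value: the number of equal rows currently held.
    private final Map<List<Object>, Integer> state = new HashMap<>();

    /** Records an added row: the first copy stores 1, later copies increment. */
    void addRecord(List<Object> row) {
        state.merge(row, 1, Integer::sum);
    }

    /** Records a retracted row: decrements, and drops the entry at zero. */
    void retractRecord(List<Object> row) {
        state.computeIfPresent(row, (k, c) -> c > 1 ? c - 1 : null);
    }

    /** Returns how many equal copies of the row are currently held. */
    int count(List<Object> row) {
        return state.getOrDefault(row, 0);
    }

    public static void main(String[] args) {
        RowCountStateSketch s = new RowCountStateSketch();
        List<Object> row = List.of(1, "a");
        s.addRecord(row);
        s.addRecord(row);                 // an identical row arrives again
        System.out.println(s.count(row)); // prints 2
        s.retractRecord(row);
        System.out.println(s.count(row)); // prints 1
    }
}
```

Whether identical rows actually reach the third join in the A join B join C case
depends on the upstream changelog, which this sketch does not model.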
> UniqueKey constraint is lost with multiple sources join in SQL
> --------------------------------------------------------------
>
> Key: FLINK-22113
> URL: https://issues.apache.org/jira/browse/FLINK-22113
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.13.0
> Reporter: Fu Kai
> Priority: Major
>
> Hi team,
>
> We have a use case that joins multiple data sources to generate a continuously
> updated view. We defined a primary key constraint on every input source, and
> each key is a subset of the join condition. All joins are left joins.
>
> In our case, the first two inputs produce a *JoinKeyContainsUniqueKey*
> input spec, which is good and performant. The third input source, however,
> is joined with the intermediate output table of the first two input tables.
> That intermediate table does not carry key constraint information (although
> the third source input table does), so the result is a *NoUniqueKey* input
> spec. Given that NoUniqueKey inputs have dramatic performance implications
> per the [Force Join Unique
> Key|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Force-Join-Unique-Key-td39521.html#a39651]
> email thread, we want to know if there is any mitigation for this.
>
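
The distinction between the input specs in the description above can be sketched
as follows. This is a simplified illustration, not the planner's actual code:
the class, enum, and method names are hypothetical, and the third case
(HAS_UNIQUE_KEY, a unique key that is not covered by the join key) is an
assumption that the issue text does not mention. The sketch shows why an input
with no known unique key falls into the NoUniqueKey case regardless of the join
condition.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of choosing a join input spec; not the planner's code. */
class InputSpecSketch {
    enum InputSpec { JOIN_KEY_CONTAINS_UNIQUE_KEY, HAS_UNIQUE_KEY, NO_UNIQUE_KEY }

    /**
     * @param uniqueKeys unique keys known for the input (empty if none are known)
     * @param joinKey    columns of this input used in the join condition
     */
    static InputSpec choose(List<Set<String>> uniqueKeys, Set<String> joinKey) {
        if (uniqueKeys.isEmpty()) {
            // No key information survived, e.g. on an intermediate join result.
            return InputSpec.NO_UNIQUE_KEY;
        }
        for (Set<String> uniqueKey : uniqueKeys) {
            if (joinKey.containsAll(uniqueKey)) {
                // The join key covers a whole unique key of this input.
                return InputSpec.JOIN_KEY_CONTAINS_UNIQUE_KEY;
            }
        }
        return InputSpec.HAS_UNIQUE_KEY;
    }

    public static void main(String[] args) {
        // A source with primary key (id), joined on id.
        System.out.println(choose(List.of(Set.of("id")), Set.of("id")));
        // An intermediate table with no key information, joined on id.
        System.out.println(choose(List.of(), Set.of("id")));
    }
}
```

In this sketch the second call returns NO_UNIQUE_KEY purely because the list of
known unique keys is empty, which mirrors the situation reported for the
intermediate table.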
--
This message was sent by Atlassian Jira
(v8.3.4#803005)