[
https://issues.apache.org/jira/browse/FLINK-31750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713024#comment-17713024
]
ZhengYi Weng commented on FLINK-31750:
--------------------------------------
It is not a bug. If the duplicate hash keys are removed, the hash codes
computed on the two sides of the join become inconsistent, resulting in
partitioning errors.
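
A minimal sketch of why this matters, assuming the join condition from the
description (a1 = a4 and a1 = a5), so the left hash keys are (a1, a1) and the
right hash keys are (a4, a5). The class `HashKeyDemo`, the helpers `partition`
and `dedup`, and the use of `Arrays.hashCode` are all hypothetical stand-ins
for illustration; this is not Flink's actual hash distribution code.

```java
import java.util.Arrays;

public class HashKeyDemo {
    // Toy hash partitioner (an assumption, not Flink's implementation):
    // hash the values of the key columns in order, then map to a partition.
    static int partition(int[] row, int[] keyIdx, int parallelism) {
        Object[] keyValues = new Object[keyIdx.length];
        for (int i = 0; i < keyIdx.length; i++) {
            keyValues[i] = row[keyIdx[i]];
        }
        return Math.floorMod(Arrays.hashCode(keyValues), parallelism);
    }

    // Removes repeated column indices while keeping their first-seen order.
    static int[] dedup(int[] keyIdx) {
        return Arrays.stream(keyIdx).distinct().toArray();
    }

    public static void main(String[] args) {
        int parallelism = 4;
        int[] leftRow = {7};       // T1 row: a1 = 7
        int[] rightRow = {7, 7};   // (T4 join T5) row: a4 = 7, a5 = 7
        int[] leftKeys = {0, 0};   // a1, a1 (duplicate column)
        int[] rightKeys = {0, 1};  // a4, a5

        // Keeping the duplicate: both sides hash two equal values,
        // so matching rows are routed to the same partition.
        System.out.println("left  (a1, a1): " + partition(leftRow, leftKeys, parallelism));
        System.out.println("right (a4, a5): " + partition(rightRow, rightKeys, parallelism));

        // Deduplicating only the left side: it now hashes one value while the
        // right side still hashes two, so matching rows can be routed apart.
        System.out.println("left  (a1)    : " + partition(leftRow, dedup(leftKeys), parallelism));
    }
}
```

In this toy model the two sides agree only while both hash the same number of
key values, which is why the duplicate key on the left side has to stay.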
> Hash Keys are duplicate when join reorder happens in stream mode
> ----------------------------------------------------------------
>
> Key: FLINK-31750
> URL: https://issues.apache.org/jira/browse/FLINK-31750
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.17.0, 1.16.1
> Reporter: ZhengYi Weng
> Assignee: ZhengYi Weng
> Priority: Major
> Attachments: image-2023-04-07-10-39-13-831.png
>
>
> When I run `JoinReorderTestBase#testAllInnerJoin` with isBushyJoinReorder
> set to false, I find that the hash keys are duplicated.
> !image-2023-04-07-10-39-13-831.png|width=571,height=263!
> This happens because join reordering changes the join condition and can
> generate several conditions on the same column. For example, the condition
> of T1 join (T4 join T5) is a1 = a4 and a1 = a5, so a1 appears twice among
> the hash keys. It can be fixed if the columns in
> `StreamPhysicalJoinRuleBase#onMatch#toHashTraitByColumns` are not duplicated.
> I will fix it.