[ https://issues.apache.org/jira/browse/KYLIN-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiaoxiang Yu resolved KYLIN-4762.
---------------------------------
    Resolution: Fixed

> Optimize join where there is the same shardby partition num on join key
> -----------------------------------------------------------------------
>
>                 Key: KYLIN-4762
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4762
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Query Engine
>    Affects Versions: v4.0.0-alpha
>            Reporter: Zhichao Zhang
>            Assignee: Zhichao Zhang
>            Priority: Minor
>             Fix For: v4.0.0
>
>         Attachments: shardby_join.png
>
>
> Optimize the join by reducing shuffle when the join key has the same
> shard-by partition number on both sides of the join.
> When executing this SQL,
> {code:sql}
> select m.seller_id, m.part_dt, sum(m.price) as s
> from kylin_sales m
> left join (
>   select m1.part_dt as pd, count(distinct m1.SELLER_ID) as m1, count(1) as m2
>   from kylin_sales m1
>   where m1.part_dt = '2012-01-05'
>   group by m1.part_dt
> ) j
> on m.part_dt = j.pd
> where m.lstg_format_name = 'FP-GTC'
> and m.part_dt = '2012-01-05'
> group by m.seller_id, m.part_dt limit 100;
> {code}
> the execution plan is shown below:
> !shardby_join.png!
> However, because the join key part_dt has the same shard-by partition number
> on both sides, the join can be optimized to reduce shuffle, similar to a
> bucket join.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
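
The bucket-join analogy in the issue description can be illustrated with a small standalone sketch. This is not Kylin's or Spark's implementation; the shard count, the CRC32-based shard function, and all helper names are assumptions made up for the example. It only shows the idea: when both inputs are already sharded on the join key with the same shard count, matching rows always sit in the same shard index, so each shard pair can be joined locally without redistributing rows.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard-by partition number, shared by both sides


def shard_of(key, num_shards=NUM_SHARDS):
    # Deterministic shard assignment (assumed here; real engines use their own hash).
    return zlib.crc32(key.encode("utf-8")) % num_shards


def shard_rows(rows, key_index, num_shards=NUM_SHARDS):
    # Simulates data that was written out already sharded on the join key.
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_of(row[key_index], num_shards)].append(row)
    return shards


def co_partitioned_left_join(left_shards, right_shards, left_key, right_key, right_width):
    # Both sides must use the same shard count, otherwise equal keys
    # could land in different shard indexes and a shuffle would be needed.
    assert len(left_shards) == len(right_shards)
    out = []
    for left_part, right_part in zip(left_shards, right_shards):
        # Join shard i of the left side only against shard i of the right side.
        lookup = defaultdict(list)
        for row in right_part:
            lookup[row[right_key]].append(row)
        for row in left_part:
            matches = lookup.get(row[left_key])
            if matches:
                out.extend(row + match for match in matches)
            else:
                out.append(row + (None,) * right_width)  # left join: keep unmatched rows
    return out


# Toy data shaped like the query: (seller_id, part_dt, price) and (pd, m1, m2).
sales = [("s1", "2012-01-05", 10.0), ("s2", "2012-01-05", 5.0), ("s3", "2012-01-06", 7.0)]
daily = [("2012-01-05", 2, 2)]

joined = co_partitioned_left_join(
    shard_rows(sales, key_index=1), shard_rows(daily, key_index=0),
    left_key=1, right_key=0, right_width=3,
)
```

If the two sides used different shard counts, the per-shard pairing above would miss matches, which is why the optimization in the issue is conditioned on the shard-by partition number being the same.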