Csaba Ringhofer created IMPALA-13048:
----------------------------------------
Summary: Shuffle hint on joins is ignored in some cases
Key: IMPALA-13048
URL: https://issues.apache.org/jira/browse/IMPALA-13048
Project: IMPALA
Issue Type: Bug
Reporter: Csaba Ringhofer
I noticed that the shuffle hint is ignored without any warning in some cases.
The shuffle hint is not applied in this query:
{code}
explain select * from alltypestiny a2 join /* +SHUFFLE */ alltypes a1 on
a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
{code}
Resulting plan (both joins are BROADCAST, including the hinted join 03):
{code}
PLAN-ROOT SINK
|
07:EXCHANGE [UNPARTITIONED]
|
04:HASH JOIN [INNER JOIN, BROADCAST]
|  hash predicates: a3.tinyint_col = a2.tinyint_col
|  runtime filters: RF000 <- a2.tinyint_col
|  row-size=267B cardinality=80
|
|--06:EXCHANGE [BROADCAST]
|  |
|  03:HASH JOIN [INNER JOIN, BROADCAST]
|  |  hash predicates: a1.id = a2.id
|  |  runtime filters: RF002 <- a2.id
|  |  row-size=178B cardinality=8
|  |
|  |--05:EXCHANGE [BROADCAST]
|  |  |
|  |  00:SCAN HDFS [functional.alltypestiny a2]
|  |     HDFS partitions=4/4 files=4 size=460B
|  |     row-size=89B cardinality=8
|  |
|  01:SCAN HDFS [functional.alltypes a1]
|     HDFS partitions=24/24 files=24 size=478.45KB
|     runtime filters: RF002 -> a1.id
|     row-size=89B cardinality=7.30K
|
02:SCAN HDFS [functional.alltypessmall a3]
   HDFS partitions=4/4 files=4 size=6.32KB
   runtime filters: RF000 -> a3.tinyint_col
   row-size=89B cardinality=100
{code}
If the positions of the first two tables are swapped, then the hint is applied:
{code}
explain select * from alltypes a1 join /* +SHUFFLE */ alltypestiny a2 on
a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
{code}
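Because the hint is dropped silently, the only way to notice it is to read the distribution mode of each join in the EXPLAIN output, where a shuffled join is reported as PARTITIONED rather than BROADCAST. The following is a minimal sketch of such a check in Python, not part of Impala. The function name join_distribution_modes and the regular expression are assumptions made for illustration, and the sample text is an abridged copy of the join lines from the plan in this report.

```python
import re

# Matches plan lines such as "03:HASH JOIN [INNER JOIN, BROADCAST]".
# Hypothetical helper pattern: it only covers hash join nodes.
JOIN_RE = re.compile(r"(\d+):HASH JOIN \[([^,\]]+),\s*([^\]]+)\]")


def join_distribution_modes(plan_text):
    """Return {node_id: distribution_mode} for every hash join in the plan text."""
    modes = {}
    for line in plan_text.splitlines():
        match = JOIN_RE.search(line)
        if match:
            modes[match.group(1)] = match.group(3).strip()
    return modes


# Abridged join and exchange lines from the plan reported above.
plan = """\
04:HASH JOIN [INNER JOIN, BROADCAST]
|--06:EXCHANGE [BROADCAST]
|  03:HASH JOIN [INNER JOIN, BROADCAST]
|  |--05:EXCHANGE [BROADCAST]
"""

modes = join_distribution_modes(plan)
# Join 03 (a1.id = a2.id) is the hinted one; the hint took effect only if
# its mode is PARTITIONED.
print(modes)
print("hint applied on join 03:", modes.get("03") == "PARTITIONED")
```

For the plan in this report the check reports that the hint was not applied on join 03.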