[
https://issues.apache.org/jira/browse/IMPALA-13048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Csaba Ringhofer updated IMPALA-13048:
-------------------------------------
Description:
I noticed that shuffle hint is ignored without any warning in some cases
shuffle hint is not applied in this query:
{code}
explain select * from alltypestiny a2 join /* +SHUFFLE */ alltypes a1 on
a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
{code}
result plan
{code}
PLAN-ROOT SINK
|
07:EXCHANGE [UNPARTITIONED]
|
04:HASH JOIN [INNER JOIN, BROADCAST]
| hash predicates: a3.tinyint_col = a2.tinyint_col
| runtime filters: RF000 <- a2.tinyint_col
| row-size=267B cardinality=80
|
|--06:EXCHANGE [BROADCAST]
| |
| 03:HASH JOIN [INNER JOIN, BROADCAST]
| | hash predicates: a1.id = a2.id
| | runtime filters: RF002 <- a2.id
| | row-size=178B cardinality=8
| |
| |--05:EXCHANGE [BROADCAST]
| | |
| | 00:SCAN HDFS [functional.alltypestiny a2]
| | HDFS partitions=4/4 files=4 size=460B
| | row-size=89B cardinality=8
| |
| 01:SCAN HDFS [functional.alltypes a1]
| HDFS partitions=24/24 files=24 size=478.45KB
| runtime filters: RF002 -> a1.id
| row-size=89B cardinality=7.30K
|
02:SCAN HDFS [functional.alltypessmall a3]
HDFS partitions=4/4 files=4 size=6.32KB
runtime filters: RF000 -> a3.tinyint_col
row-size=89B cardinality=100
{code}
if the first two tables' position is swapped, then it is applied:
{code}
explain select * from alltypes a1 join /* +SHUFFLE */ alltypestiny a2 on
a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
{code}
was:
I noticed that shuffle hint is ignore without any warning in some cases
shuffle hint is not applied in this query:
{code}
explain select * from alltypestiny a2 join /* +SHUFFLE */ alltypes a1 on
a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
{code}
result plan
{code}
PLAN-ROOT SINK
|
07:EXCHANGE [UNPARTITIONED]
|
04:HASH JOIN [INNER JOIN, BROADCAST]
| hash predicates: a3.tinyint_col = a2.tinyint_col
| runtime filters: RF000 <- a2.tinyint_col
| row-size=267B cardinality=80
|
|--06:EXCHANGE [BROADCAST]
| |
| 03:HASH JOIN [INNER JOIN, BROADCAST]
| | hash predicates: a1.id = a2.id
| | runtime filters: RF002 <- a2.id
| | row-size=178B cardinality=8
| |
| |--05:EXCHANGE [BROADCAST]
| | |
| | 00:SCAN HDFS [functional.alltypestiny a2]
| | HDFS partitions=4/4 files=4 size=460B
| | row-size=89B cardinality=8
| |
| 01:SCAN HDFS [functional.alltypes a1]
| HDFS partitions=24/24 files=24 size=478.45KB
| runtime filters: RF002 -> a1.id
| row-size=89B cardinality=7.30K
|
02:SCAN HDFS [functional.alltypessmall a3]
HDFS partitions=4/4 files=4 size=6.32KB
runtime filters: RF000 -> a3.tinyint_col
row-size=89B cardinality=100
{code}
if the first two tables' position is swapped, then it is applied:
{code}
explain select * from alltypes a1 join /* +SHUFFLE */ alltypestiny a2 on
a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
{code}
> Shuffle hint on joins is ignored in some cases
> ----------------------------------------------
>
> Key: IMPALA-13048
> URL: https://issues.apache.org/jira/browse/IMPALA-13048
> Project: IMPALA
> Issue Type: Bug
> Reporter: Csaba Ringhofer
> Priority: Major
>
> I noticed that shuffle hint is ignored without any warning in some cases
> shuffle hint is not applied in this query:
> {code}
> explain select * from alltypestiny a2 join /* +SHUFFLE */ alltypes a1 on
> a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
> {code}
> result plan
> {code}
> PLAN-ROOT SINK
> |
> 07:EXCHANGE [UNPARTITIONED]
> |
> 04:HASH JOIN [INNER JOIN, BROADCAST]
> | hash predicates: a3.tinyint_col = a2.tinyint_col
> | runtime filters: RF000 <- a2.tinyint_col
> | row-size=267B cardinality=80
> |
> |--06:EXCHANGE [BROADCAST]
> | |
> | 03:HASH JOIN [INNER JOIN, BROADCAST]
> | | hash predicates: a1.id = a2.id
> | | runtime filters: RF002 <- a2.id
> | | row-size=178B cardinality=8
> | |
> | |--05:EXCHANGE [BROADCAST]
> | | |
> | | 00:SCAN HDFS [functional.alltypestiny a2]
> | | HDFS partitions=4/4 files=4 size=460B
> | | row-size=89B cardinality=8
> | |
> | 01:SCAN HDFS [functional.alltypes a1]
> | HDFS partitions=24/24 files=24 size=478.45KB
> | runtime filters: RF002 -> a1.id
> | row-size=89B cardinality=7.30K
> |
> 02:SCAN HDFS [functional.alltypessmall a3]
> HDFS partitions=4/4 files=4 size=6.32KB
> runtime filters: RF000 -> a3.tinyint_col
> row-size=89B cardinality=100
> {code}
> if the first two tables' position is swapped, then it is applied:
> {code}
> explain select * from alltypes a1 join /* +SHUFFLE */ alltypestiny a2 on
> a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]