[ 
https://issues.apache.org/jira/browse/IMPALA-13048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-13048:
-------------------------------------
    Description: 
I noticed that shuffle hint is ignored without any warning in some cases

shuffle hint is not applied in this query:

{code}
explain select  * from alltypestiny a2 join /* +SHUFFLE */ alltypes a1 on 
a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
{code}
result plan
{code}
PLAN-ROOT SINK
|
07:EXCHANGE [UNPARTITIONED]
|
04:HASH JOIN [INNER JOIN, BROADCAST]
|  hash predicates: a3.tinyint_col = a2.tinyint_col
|  runtime filters: RF000 <- a2.tinyint_col
|  row-size=267B cardinality=80
|
|--06:EXCHANGE [BROADCAST]
|  |
|  03:HASH JOIN [INNER JOIN, BROADCAST]
|  |  hash predicates: a1.id = a2.id
|  |  runtime filters: RF002 <- a2.id
|  |  row-size=178B cardinality=8
|  |
|  |--05:EXCHANGE [BROADCAST]
|  |  |
|  |  00:SCAN HDFS [functional.alltypestiny a2]
|  |     HDFS partitions=4/4 files=4 size=460B
|  |     row-size=89B cardinality=8
|  |
|  01:SCAN HDFS [functional.alltypes a1]
|     HDFS partitions=24/24 files=24 size=478.45KB
|     runtime filters: RF002 -> a1.id
|     row-size=89B cardinality=7.30K
|
02:SCAN HDFS [functional.alltypessmall a3]
   HDFS partitions=4/4 files=4 size=6.32KB
   runtime filters: RF000 -> a3.tinyint_col
   row-size=89B cardinality=100
{code}

if the first two tables' position is swapped, then it is applied:
{code}
explain select  * from alltypes a1 join /* +SHUFFLE */ alltypestiny a2 on 
a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
{code}

  was:
I noticed that shuffle hint is ignore without any warning in some cases

shuffle hint is not applied in this query:

{code}
explain select  * from alltypestiny a2 join /* +SHUFFLE */ alltypes a1 on 
a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
{code}
result plan
{code}
PLAN-ROOT SINK
|
07:EXCHANGE [UNPARTITIONED]
|
04:HASH JOIN [INNER JOIN, BROADCAST]
|  hash predicates: a3.tinyint_col = a2.tinyint_col
|  runtime filters: RF000 <- a2.tinyint_col
|  row-size=267B cardinality=80
|
|--06:EXCHANGE [BROADCAST]
|  |
|  03:HASH JOIN [INNER JOIN, BROADCAST]
|  |  hash predicates: a1.id = a2.id
|  |  runtime filters: RF002 <- a2.id
|  |  row-size=178B cardinality=8
|  |
|  |--05:EXCHANGE [BROADCAST]
|  |  |
|  |  00:SCAN HDFS [functional.alltypestiny a2]
|  |     HDFS partitions=4/4 files=4 size=460B
|  |     row-size=89B cardinality=8
|  |
|  01:SCAN HDFS [functional.alltypes a1]
|     HDFS partitions=24/24 files=24 size=478.45KB
|     runtime filters: RF002 -> a1.id
|     row-size=89B cardinality=7.30K
|
02:SCAN HDFS [functional.alltypessmall a3]
   HDFS partitions=4/4 files=4 size=6.32KB
   runtime filters: RF000 -> a3.tinyint_col
   row-size=89B cardinality=100
{code}

if the first two tables' position is swapped, then it is applied:
{code}
explain select  * from alltypes a1 join /* +SHUFFLE */ alltypestiny a2 on 
a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
{code}


> Shuffle hint on joins is ignored in some cases
> ----------------------------------------------
>
>                 Key: IMPALA-13048
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13048
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Csaba Ringhofer
>            Priority: Major
>
> I noticed that shuffle hint is ignored without any warning in some cases
> shuffle hint is not applied in this query:
> {code}
> explain select  * from alltypestiny a2 join /* +SHUFFLE */ alltypes a1 on 
> a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
> {code}
> result plan
> {code}
> PLAN-ROOT SINK
> |
> 07:EXCHANGE [UNPARTITIONED]
> |
> 04:HASH JOIN [INNER JOIN, BROADCAST]
> |  hash predicates: a3.tinyint_col = a2.tinyint_col
> |  runtime filters: RF000 <- a2.tinyint_col
> |  row-size=267B cardinality=80
> |
> |--06:EXCHANGE [BROADCAST]
> |  |
> |  03:HASH JOIN [INNER JOIN, BROADCAST]
> |  |  hash predicates: a1.id = a2.id
> |  |  runtime filters: RF002 <- a2.id
> |  |  row-size=178B cardinality=8
> |  |
> |  |--05:EXCHANGE [BROADCAST]
> |  |  |
> |  |  00:SCAN HDFS [functional.alltypestiny a2]
> |  |     HDFS partitions=4/4 files=4 size=460B
> |  |     row-size=89B cardinality=8
> |  |
> |  01:SCAN HDFS [functional.alltypes a1]
> |     HDFS partitions=24/24 files=24 size=478.45KB
> |     runtime filters: RF002 -> a1.id
> |     row-size=89B cardinality=7.30K
> |
> 02:SCAN HDFS [functional.alltypessmall a3]
>    HDFS partitions=4/4 files=4 size=6.32KB
>    runtime filters: RF000 -> a3.tinyint_col
>    row-size=89B cardinality=100
> {code}
> if the first two tables' position is swapped, then it is applied:
> {code}
> explain select  * from alltypes a1 join /* +SHUFFLE */ alltypestiny a2 on 
> a1.id=a2.id join alltypessmall a3 on a2.tinyint_col=a3.tinyint_col;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to