[
https://issues.apache.org/jira/browse/HIVE-21950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Murshid Chalaev updated HIVE-21950:
-----------------------------------
Description:
Queries with SUBSTR function in EXISTS clause have much more complicated
execution plan in Hive-2.3 with enabled CBO then in Hive-2.3 with disabled CBO.
A query below has 8 stages which submit 4 MR jobs in Hive-2.3 with enabled CBO,
while with disabled CBO it has 4 stages and submits 1 MR job.
*STEPS TO REPRODUCE:*
{code:java}
CREATE TABLE i1122 (id STRING);
INSERT INTO i1122 VALUES (1),(1001);
EXPLAIN
SELECT *
FROM i1122 AS t1
WHERE EXISTS (
SELECT 1
FROM i1122 AS t2
WHERE t2.id = substr(t1.id,4)
);{code}
*ACTUAL RESULT:*
Explain plan in Hive-2.3 with disabled CBO(The same execution plan was in
Hive-1.2 with enabled CBO):
{code:java}
hive> SET hive.cbo.enable=false;
hive> EXPLAIN
> SELECT *
> FROM i1122 AS t1
> WHERE EXISTS (
> SELECT 1
> FROM i1122 AS t2
> WHERE t2.id = substr(t1.id,4)
> );
OK
STAGE DEPENDENCIES:
Stage-4 is a root stage
Stage-3 depends on stages: Stage-4
Stage-0 depends on stages: Stage-3
STAGE PLANS:
...
{code}
Explain plan in Hive-2.3 with enabled CBO:
{code:java}
hive> SET hive.cbo.enable=true;
hive> EXPLAIN
> SELECT *
> FROM i1122 AS t1
> WHERE EXISTS (
> SELECT 1
> FROM i1122 AS t2
> WHERE t2.id = substr(t1.id,4)
> );
OK
STAGE DEPENDENCIES:
Stage-3 is a root stage
Stage-8 depends on stages: Stage-3
Stage-6 depends on stages: Stage-8
Stage-5 depends on stages: Stage-6 , consists of Stage-7, Stage-1
Stage-7 has a backup stage: Stage-1
Stage-4 depends on stages: Stage-7
Stage-1
Stage-0 depends on stages: Stage-4, Stage-1
STAGE PLANS:
...
{code}
Full explain plans you can find in attachment
was:
Queries with SUBSTR function in EXISTS clause have much more complicated
execution plan in Hive-2.3 then it was in Hive-1.2. A query below has 8 stages
which submit 4 MR jobs in Hive-2.3, while in Hive-1.2 it has 4 stages and
submits 1 MR job. Without SUBSTR function or with disabled CBO in Hive-2.3
execution plan is the same as in Hive-1.2 with enabled CBO.
*STEPS TO REPRODUCE:*
{code:java}
CREATE TABLE i1122 (id STRING);
INSERT INTO i1122 VALUES (1),(1001);
EXPLAIN
SELECT *
FROM i1122 AS t1
WHERE EXISTS (
SELECT 1
FROM i1122 AS t2
WHERE t2.id = substr(t1.id,4)
);{code}
> Optimizer complicates execution plan of queries with SUBSTR function in
> EXISTS clause
> -------------------------------------------------------------------------------------
>
> Key: HIVE-21950
> URL: https://issues.apache.org/jira/browse/HIVE-21950
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.3.0, 3.1.1
> Reporter: Murshid Chalaev
> Priority: Minor
>
> Queries with SUBSTR function in EXISTS clause have much more complicated
> execution plan in Hive-2.3 with enabled CBO then in Hive-2.3 with disabled
> CBO. A query below has 8 stages which submit 4 MR jobs in Hive-2.3 with
> enabled CBO, while with disabled CBO it has 4 stages and submits 1 MR job.
> *STEPS TO REPRODUCE:*
> {code:java}
> CREATE TABLE i1122 (id STRING);
> INSERT INTO i1122 VALUES (1),(1001);
> EXPLAIN
> SELECT *
> FROM i1122 AS t1
> WHERE EXISTS (
> SELECT 1
> FROM i1122 AS t2
> WHERE t2.id = substr(t1.id,4)
> );{code}
> *ACTUAL RESULT:*
> Explain plan in Hive-2.3 with disabled CBO(The same execution plan was in
> Hive-1.2 with enabled CBO):
> {code:java}
> hive> SET hive.cbo.enable=false;
> hive> EXPLAIN
> > SELECT *
> > FROM i1122 AS t1
> > WHERE EXISTS (
> > SELECT 1
> > FROM i1122 AS t2
> > WHERE t2.id = substr(t1.id,4)
> > );
> OK
> STAGE DEPENDENCIES:
> Stage-4 is a root stage
> Stage-3 depends on stages: Stage-4
> Stage-0 depends on stages: Stage-3
> STAGE PLANS:
> ...
> {code}
> Explain plan in Hive-2.3 with enabled CBO:
> {code:java}
> hive> SET hive.cbo.enable=true;
> hive> EXPLAIN
> > SELECT *
> > FROM i1122 AS t1
> > WHERE EXISTS (
> > SELECT 1
> > FROM i1122 AS t2
> > WHERE t2.id = substr(t1.id,4)
> > );
> OK
> STAGE DEPENDENCIES:
> Stage-3 is a root stage
> Stage-8 depends on stages: Stage-3
> Stage-6 depends on stages: Stage-8
> Stage-5 depends on stages: Stage-6 , consists of Stage-7, Stage-1
> Stage-7 has a backup stage: Stage-1
> Stage-4 depends on stages: Stage-7
> Stage-1
> Stage-0 depends on stages: Stage-4, Stage-1
> STAGE PLANS:
> ...
> {code}
> Full explain plans you can find in attachment
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)