[ https://issues.apache.org/jira/browse/HIVE-22039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rajesh Balamohan updated HIVE-22039:
------------------------------------
Description:
Here is a very simple repro for this case. The query crashes HS2. A temporary
workaround is to turn off CBO for such corner cases.
It runs into an infinite loop that creates too many RexCalls and finally
OOMs.
This is observed in 2.x and 3.x.
With 4.x (master branch), it does not happen. Master has
{{calcite-core-1.19.0.jar}}, whereas 3.x has {{calcite-core-1.16.0.jar}}.
I haven't yet investigated which specific Calcite bug triggers this
(if I move hive-master to {{calcite-core-1.16.0.jar}}, the problem shows up in
4.x as well).
{noformat}
drop table if exists tableA;
drop table if exists tableB;
create table if not exists tableA(id int, reporting_date string) stored as orc;
create table if not exists tableB(id int, reporting_date string)
  partitioned by (datestr string) stored as orc;
explain with tableA_cte as (
  select
    id,
    reporting_date
  from tableA
),
tableA_cte_2 as (
  select
    0 as id,
    reporting_date
  from tableA
),
tableA_cte_5 as (
  select * from tableA_cte
  union
  select * from tableA_cte_2
),
tableB_cte_0 as (
  select
    id,
    reporting_date
  from tableB
  where reporting_date = '2018-10-29'
),
tableB_cte_1 as (
  select
    0 as id,
    reporting_date
  from tableB
  where datestr = '2018-10-29'
),
tableB_cte_4 as (
  select * from tableB_cte_0
  union
  select * from tableB_cte_1
)
select
  a.id as id,
  b.reporting_date
from tableA_cte_5 a
join tableB_cte_4 b on (a.id = b.id and a.reporting_date = b.reporting_date);
{noformat}
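A minimal sketch of the workaround mentioned above, assuming CBO is turned off per session through the {{hive.cbo.enable}} property (the report itself does not name the exact setting, so treat this as one possible way to apply it):
{noformat}
-- Assumption: disable CBO for the current session only, right before the affected query.
set hive.cbo.enable=false;
-- ... run the affected query here ...
-- Re-enable CBO afterwards so that other queries still get cost-based plans.
set hive.cbo.enable=true;
{noformat}
Scoping the setting to the session keeps the change limited to the queries that hit this corner case, rather than disabling CBO for the whole HS2 instance.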
was:
Here is a very simple repro for this case.
This, along with CBO, would crash HS2. It runs into an infinite loop that creates
too many RexCalls and finally OOMs.
This is observed in 2.x and 3.x.
With 4.x (master branch), it does not happen. Master has
{{calcite-core-1.19.0.jar}}, whereas 3.x has {{calcite-core-1.16.0.jar}}.
{noformat}
drop table if exists tableA;
drop table if exists tableB;
create table if not exists tableA(id int, reporting_date string) stored as orc;
create table if not exists tableB(id int, reporting_date string) partitioned by
(datestr string) stored as orc;
explain with tableA_cte as (
select
id,
reporting_date
from tableA
),
tableA_cte_2 as (
select
0 as id,
reporting_date
from tableA
),
tableA_cte_5 as (
select * from tableA_cte
union
select * from tableA_cte_2
),
tableB_cte_0 as (
select
id,
reporting_date
from tableB
where reporting_date = '2018-10-29'
),
tableB_cte_1 as (
select
0 as id,
reporting_date
from tableB
where datestr = '2018-10-29'
),
tableB_cte_4 as (
select * from tableB_cte_0
union
select * from tableB_cte_1
)
select
a.id as id,
b.reporting_date
from tableA_cte_5 a
join tableB_cte_4 b on (a.id = b.id and a.reporting_date = b.reporting_date);
{noformat}
> Query with CBO crashes HS2 in corner case
> -----------------------------------------
>
> Key: HIVE-22039
> URL: https://issues.apache.org/jira/browse/HIVE-22039
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 3.1.1, 2.3.4
> Reporter: Rajesh Balamohan
> Priority: Major