[
https://issues.apache.org/jira/browse/HIVE-28598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
yongzhi.shao updated HIVE-28598:
--------------------------------
Description:
In some scenarios, joining two Iceberg tables results in a
NullPointerException (NPE).
INIT-SQL:
{code:sql}
CREATE TABLE T1
(
    ID  STRING,
    ID2 STRING,
    ID3 STRING
) STORED BY ICEBERG STORED AS ORC;

CREATE TABLE T2
(
    ID  STRING,
    ID2 STRING
) STORED BY ICEBERG STORED AS ORC;

CREATE TABLE T1_ORC
(
    ID  STRING,
    ID2 STRING,
    ID3 STRING
) STORED AS ORC;

CREATE TABLE T2_ORC
(
    ID  STRING,
    ID2 STRING
) STORED AS ORC;
{code}
1. When the bucket_version of the T1 table is different from that of the T2
table, running the SQL shown below will throw an error:
{code:sql}
select count(1)
from
    (select ID, ID2, ID3 from test.t1) t
left join
    (select ID, ID2 from test.t2) t2
on t.ID = t2.ID and t.ID2 = t2.ID2;
{code}
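To tell which of the two cases applies, the bucketing version of each table can be inspected and, if needed, aligned. This is only a sketch: it assumes that "bucket_version" here refers to Hive's bucketing_version table property, and the value '2' is purely illustrative.

{code:sql}
-- Assumption: "bucket_version" means the bucketing_version table property.
SHOW TBLPROPERTIES test.t1("bucketing_version");
SHOW TBLPROPERTIES test.t2("bucketing_version");

-- Align T2 with T1. The value '2' is illustrative; use whatever T1 reports.
ALTER TABLE test.t2 SET TBLPROPERTIES ('bucketing_version'='2');
{code}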
2. When the bucket_version of the T1 and T2 tables is the same, problem 1
disappears, but the following SQL still throws an exception:
{code:sql}
select count(1)
from
    (select ID, ID2 from test.t1 WHERE ID3 = 'NORMAL') t
left join
    (select ID, ID2 from test.t2) t2
on t.ID = t2.ID and t.ID2 = t2.ID2;
{code}
When I replace the T1 and T2 tables with the T1_ORC and T2_ORC tables, the
same SQL executes without error.
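For reference, the control case can be written out as follows. It is query 2 with only the table names swapped, and it assumes that T1_ORC and T2_ORC were created in the same test database as the Iceberg tables.

{code:sql}
-- Same as query 2, but against the plain ORC tables
-- (assumed to live in database test).
select count(1)
from
    (select ID, ID2 from test.t1_orc WHERE ID3 = 'NORMAL') t
left join
    (select ID, ID2 from test.t2_orc) t2
on t.ID = t2.ID and t.ID2 = t2.ID2;
{code}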
was: Currently, we have found that in some scenarios, join operations using
two iceberg tables may result in NPEs.
> Join by using two iceberg table may cause NPE
> ---------------------------------------------
>
> Key: HIVE-28598
> URL: https://issues.apache.org/jira/browse/HIVE-28598
> Project: Hive
> Issue Type: Bug
> Security Level: Public(Viewable by anyone)
> Components: Iceberg integration, Query Processor
> Affects Versions: 4.0.1
> Reporter: yongzhi.shao
> Priority: Major