[
https://issues.apache.org/jira/browse/HIVE-23712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
László Bodor updated HIVE-23712:
--------------------------------
Description:
Similarly to HIVE-15397, queries can return incorrect results for metadata-only
queries, here is a repro scenario which affects master:
{code}
set hive.support.concurrency=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.optimize.metadataonly=true;
create table test1 (id int, val string) partitioned by (val2 string) STORED AS
ORC TBLPROPERTIES ('transactional'='true');
describe formatted test1;
alter table test1 add partition (val2='foo');
alter table test1 add partition (val2='bar');
insert into test1 partition (val2='foo') values (1, 'abc');
select distinct val2, current_timestamp from test1;
insert into test1 partition (val2='bar') values (1, 'def');
delete from test1 where val2 = 'bar';
select '--> hive.optimize.metadataonly=true';
select distinct val2, current_timestamp from test1;
set hive.optimize.metadataonly=false;
select '--> hive.optimize.metadataonly=false';
select distinct val2, current_timestamp from test1;
select current_timestamp, * from test1;
{code}
in this case 2 rows returned instead of 1 after a delete with metadata only
optimization:
https://github.com/abstractdog/hive/commit/a7f03513564d01f7c3ba4aa61c4c6537100b4d3f#diff-cb23043000831f41fe7041cb38f82224R114-R128
was:
Similarly to HIVE-15397, queries can return incorrect results for metadata-only
queries, here is a repro scenario which affects master:
{code}
set hive.support.concurrency=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.optimize.metadataonly=true;
create table test1 (id int, val string) partitioned by (val2 string) STORED AS
ORC TBLPROPERTIES ('transactional'='true');
describe formatted test1;
alter table test1 add partition (val2='foo');
alter table test1 add partition (val2='bar');
insert into test1 partition (val2='foo') values (1, 'abc');
select distinct val2, current_timestamp from test1;
insert into test1 partition (val2='bar') values (1, 'def');
delete from test1 where val2 = 'bar';
select '--> hive.optimize.metadataonly=true';
select distinct val2, current_timestamp from test1;
set hive.optimize.metadataonly=false;
select '--> hive.optimize.metadataonly=false';
select distinct val2, current_timestamp from test1;
select current_timestamp, * from test1;
{code}
> metadata-only queries return incorrect results with empty partition
> -------------------------------------------------------------------
>
> Key: HIVE-23712
> URL: https://issues.apache.org/jira/browse/HIVE-23712
> Project: Hive
> Issue Type: Bug
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
>
> Similarly to HIVE-15397, queries can return incorrect results for
> metadata-only queries, here is a repro scenario which affects master:
> {code}
> set hive.support.concurrency=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.optimize.metadataonly=true;
> create table test1 (id int, val string) partitioned by (val2 string) STORED
> AS ORC TBLPROPERTIES ('transactional'='true');
> describe formatted test1;
> alter table test1 add partition (val2='foo');
> alter table test1 add partition (val2='bar');
> insert into test1 partition (val2='foo') values (1, 'abc');
> select distinct val2, current_timestamp from test1;
> insert into test1 partition (val2='bar') values (1, 'def');
> delete from test1 where val2 = 'bar';
> select '--> hive.optimize.metadataonly=true';
> select distinct val2, current_timestamp from test1;
> set hive.optimize.metadataonly=false;
> select '--> hive.optimize.metadataonly=false';
> select distinct val2, current_timestamp from test1;
> select current_timestamp, * from test1;
> {code}
> in this case 2 rows returned instead of 1 after a delete with metadata only
> optimization:
> https://github.com/abstractdog/hive/commit/a7f03513564d01f7c3ba4aa61c4c6537100b4d3f#diff-cb23043000831f41fe7041cb38f82224R114-R128
--
This message was sent by Atlassian Jira
(v8.3.4#803005)