[
https://issues.apache.org/jira/browse/HIVE-28050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Krisztian Kasa resolved HIVE-28050.
-----------------------------------
Resolution: Fixed
Merged to master. Thanks [~amansinha100] and [~zabetak] for review.
> Disable Incremental non aggregated materialized view rebuild in presence of
> delete operations
> ---------------------------------------------------------------------------------------------
>
> Key: HIVE-28050
> URL: https://issues.apache.org/jira/browse/HIVE-28050
> Project: Hive
> Issue Type: Bug
> Components: Materialized views
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.1.0
>
>
> To support incremental rebuild of materialized views which definition does
> not have aggregate in presence of delete operations in any of its source
> tables the records of the source tables need to be uniquely identified and
> right joined with the records present in the view.
> The join keys should be the unique columns of each table in the definition
> query but we can not determine which are those columns.
> One possibility is to project ROW_IDs of each source table in the view
> definition but the writeId component is changing at delete.
> Another way is to project columns of primary keys or unique keys but these
> constraints are not enforced in Hive.
> Current implementation leads to data correctness issues:
> {code}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create table cmv_basetable_n6 (a int, b varchar(256), c decimal(10,2), d int)
> stored as orc TBLPROPERTIES ('transactional'='true');
> insert into cmv_basetable_n6 values
> (1, 'bob', 10.30, 2),
> (1, 'alfred', 10.30, 2),
> (2, 'bob', 3.14, 3),
> (2, 'bonnie', 172342.2, 3),
> (3, 'calvin', 978.76, 3),
> (3, 'charlie', 9.8, 1);
> create table cmv_basetable_2_n3 (a int, b varchar(256), c decimal(10,2), d
> int) stored as orc TBLPROPERTIES ('transactional'='true');
> insert into cmv_basetable_2_n3 values
> (1, 'alfred', 10.30, 2),
> (3, 'calvin', 978.76, 3);
> CREATE MATERIALIZED VIEW cmv_mat_view_n6
> TBLPROPERTIES ('transactional'='true') AS
> SELECT cmv_basetable_n6.a, cmv_basetable_2_n3.c
> FROM cmv_basetable_n6 JOIN cmv_basetable_2_n3 ON (cmv_basetable_n6.a =
> cmv_basetable_2_n3.a)
> WHERE cmv_basetable_2_n3.c > 10.0;
> delete from cmv_basetable_n6 where b = 'bob';
> explain cbo
> alter materialized view cmv_mat_view_n6 rebuild;
> alter materialized view cmv_mat_view_n6 rebuild;
> select * from cmv_mat_view_n6;
> {code}
> {code}
> 3 978.76
> 3 978.76
> {code}
> but it should be
> {code}
> 1 10.30
> 3 978.76
> 3 978.76
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)