[
https://issues.apache.org/jira/browse/HIVE-26498?focusedWorklogId=807393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-807393
]
ASF GitHub Bot logged work on HIVE-26498:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 09/Sep/22 13:02
Start Date: 09/Sep/22 13:02
Worklog Time Spent: 10m
Work Description: kasakrisz commented on code in PR #3552:
URL: https://github.com/apache/hive/pull/3552#discussion_r967046476
##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveMaterializedViewUtils.java:
##########
@@ -160,11 +181,79 @@ public static Boolean isOutdatedMaterializedView(
return false;
}
+ private static Boolean isOutdatedMaterializedView(
+ MaterializationSnapshot snapshot, Hive db,
+ Set<TableName> tablesUsed, Table materializedViewTable) throws HiveException {
+ List<String> tablesUsedNames = tablesUsed.stream()
+ .map(tableName -> TableName.getDbTable(tableName.getDb(), tableName.getTable()))
+ .collect(Collectors.toList());
+
+ Map<String, String> snapshotMap = snapshot.getTableSnapshots();
+ if (snapshotMap == null || snapshotMap.isEmpty()) {
+ LOG.debug("Materialized view " + materializedViewTable.getFullyQualifiedName() +
+ " ignored for rewriting as we could not obtain current snapshot ids");
+ return null;
+ }
+
+ Set<String> storedTablesUsed = materializedViewTable.getMVMetadata().getSourceTableFullNames();
+ for (String fullyQualifiedTableName : tablesUsedNames) {
+ // Note. If the materialized view does not contain a table that is contained in the query,
+ // we do not need to check whether that specific table is outdated or not. If a rewriting
+ // is produced in those cases, it is because that additional table is joined with the
+ // existing tables with an append-columns only join, i.e., PK-FK + not null.
+ if (!storedTablesUsed.contains(fullyQualifiedTableName)) {
+ continue;
+ }
+
+ Table table = db.getTable(fullyQualifiedTableName);
+ if (table.getStorageHandler() == null) {
+ LOG.debug("Materialized view {} ignored for rewriting as we could not storage handler of table {}",
+ materializedViewTable.getFullyQualifiedName(), fullyQualifiedTableName);
+ return null;
+ }
+ String currentTableSnapshot = table.getStorageHandler().getCurrentSnapshotId(table);
+ if (isBlank(currentTableSnapshot)) {
Review Comment:
Refactored this API and its usage:
* Renamed `getCurrentSnapshotId` to `getCurrentSnapshotContext`; it now returns a
`SnapshotContext` object that wraps the `long snapshotId` instead of a `String`.
* The default implementation of `getCurrentSnapshotContext` returns `null`, and the
Iceberg implementation can also return `null` when the table is empty and
therefore has no current snapshot.
* Introduced a `boolean areSnapshotsSupported` API method to distinguish
an empty table from a storage handler that does not support snapshots.
* Adjusted the usage on the client side, where we check whether the MV is up to date.
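To illustrate the contract described above, here is a minimal, self-contained sketch. It is not the code from the PR: the `StorageHandler` interface below is a trimmed-down, hypothetical stand-in for Hive's storage handler, `getCurrentSnapshotContext` takes a plain table name here for brevity, and `FixedSnapshotHandler` and `isOutdated` are illustrative names. How an empty table is compared against the stored snapshot is also an assumption.

```java
// Hypothetical sketch of the refactored snapshot API; names other than
// SnapshotContext, getCurrentSnapshotContext and areSnapshotsSupported are invented.
final class SnapshotCheckSketch {

  /** Wraps a long snapshot id, as described in the review comment. */
  static final class SnapshotContext {
    private final long snapshotId;

    SnapshotContext(long snapshotId) {
      this.snapshotId = snapshotId;
    }

    long getSnapshotId() {
      return snapshotId;
    }
  }

  /** Trimmed-down stand-in for a storage handler; both defaults mean "no snapshot support". */
  interface StorageHandler {
    default boolean areSnapshotsSupported() {
      return false;
    }

    default SnapshotContext getCurrentSnapshotContext(String tableName) {
      return null;
    }
  }

  /** Illustrative handler that supports snapshots; a null id models an empty table. */
  static final class FixedSnapshotHandler implements StorageHandler {
    private final Long currentSnapshotId;

    FixedSnapshotHandler(Long currentSnapshotId) {
      this.currentSnapshotId = currentSnapshotId;
    }

    @Override
    public boolean areSnapshotsSupported() {
      return true;
    }

    @Override
    public SnapshotContext getCurrentSnapshotContext(String tableName) {
      return currentSnapshotId == null ? null : new SnapshotContext(currentSnapshotId);
    }
  }

  /**
   * Returns TRUE if the source table changed since the MV was built, FALSE if not,
   * and null if this cannot be determined (no handler or no snapshot support).
   */
  static Boolean isOutdated(StorageHandler handler, String tableName, Long storedSnapshotId) {
    if (handler == null || !handler.areSnapshotsSupported()) {
      return null; // cannot decide, so the caller should skip the rewrite
    }
    SnapshotContext current = handler.getCurrentSnapshotContext(tableName);
    if (current == null) {
      // Assumption: an empty table is up to date only if the MV was also built on an empty table.
      return storedSnapshotId != null;
    }
    return storedSnapshotId == null || storedSnapshotId.longValue() != current.getSnapshotId();
  }
}
```

With this shape, a `null` context is ambiguous only if read on its own; checking `areSnapshotsSupported` first is what lets the caller tell "empty table" apart from "handler has no snapshots".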
Issue Time Tracking
-------------------
Worklog Id: (was: 807393)
Time Spent: 4h 50m (was: 4h 40m)
> Implement MV maintenance with Iceberg sources using full rebuild
> ----------------------------------------------------------------
>
> Key: HIVE-26498
> URL: https://issues.apache.org/jira/browse/HIVE-26498
> Project: Hive
> Issue Type: Sub-task
> Components: Materialized views
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Time Spent: 4h 50m
> Remaining Estimate: 0h
>
> {code}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create external table tbl_ice(a int, b string, c int) stored by iceberg
> stored as orc tblproperties ('format-version'='2');
> insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52),
> (4, 'four', 53), (5, 'five', 54);
> create materialized view mat1 as
> select b, c from tbl_ice where c > 52;
> insert into tbl_ice values (111, 'one', 55), (333, 'two', 56);
> explain cbo
> alter materialized view mat1 rebuild;
> alter materialized view mat1 rebuild;
> {code}
> MV full rebuild plan
> {code}
> CBO PLAN:
> HiveProject(b=[$1], c=[$2])
> HiveFilter(condition=[>($2, 52)])
> HiveTableScan(table=[[default, tbl_ice]], table:alias=[tbl_ice])
> {code}