[ https://issues.apache.org/jira/browse/HIVE-24854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Krisztian Kasa resolved HIVE-24854.
-----------------------------------
Resolution: Fixed
Pushed to master. Thanks [~jcamachorodriguez] for the review.
> Incremental Materialized view refresh in presence of update/delete operations
> -----------------------------------------------------------------------------
>
> Key: HIVE-24854
> URL: https://issues.apache.org/jira/browse/HIVE-24854
> Project: Hive
> Issue Type: Improvement
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> The current implementation of incremental materialized view rebuild cannot be
> used if any of the materialized view's source tables has had an update or
> delete operation since the last rebuild. In such cases a full rebuild should
> be performed.
> Steps to enable incremental rebuild:
> 1. Introduce a new virtual column to mark a row as deleted
> 2. Execute the query in the view definition
> 2.a. Add a filter to each table scan in order to pull only the rows from each
> source table that have a higher writeId than the writeId of the last rebuild
> - this is already implemented by the current incremental rebuild
> 2.b. Add the "row is deleted" virtual column to each table scan. In join
> nodes, if any of the branches has a deleted row, the result row is also
> deleted.
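Steps 2.a and 2.b can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not Hive code: the Row class, its is_deleted field (standing in for the proposed virtual column), and the scan_since and join helpers are hypothetical names made up for the example. It also assumes a simple inner join on a single key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    key: int
    value: str
    write_id: int
    is_deleted: bool = False  # hypothetical stand-in for the new virtual column


def scan_since(rows, last_rebuild_write_id):
    """Step 2.a: keep only rows with a writeId higher than the last rebuild's."""
    return [r for r in rows if r.write_id > last_rebuild_write_id]


def join(left, right):
    """Step 2.b: inner join on key; the result is deleted if either input is."""
    return [
        (l.key, l.value, r.value, l.is_deleted or r.is_deleted)
        for l in left
        for r in right
        if l.key == r.key
    ]


# Made-up sample data; the last rebuild is assumed to have seen writeId 10.
orders = [Row(1, "a", 5), Row(2, "b", 12), Row(3, "c", 13, is_deleted=True)]
customers = [Row(2, "x", 11), Row(3, "y", 14), Row(4, "z", 15)]

joined = join(scan_since(orders, 10), scan_since(customers, 10))
```

In this sample the row with key 3 is flagged as deleted in the join output because its orders side was deleted, even though the matching customers row is live.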
> We should distinguish two types of view definition queries: with and without
> Aggregate.
> 3.a. No aggregate path:
> Rewrite the plan of the full rebuild to a multi insert statement with two
> insert branches: one branch to insert new rows into the materialized view
> table and a second one to insert deleted rows into the materialized view
> delete delta.
> 3.b Aggregate path: TBD
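The two insert branches of the no-aggregate path (3.a) amount to routing each result row by its deleted flag. A minimal Python sketch follows, assuming each result row is a tuple whose last element is that flag; split_for_multi_insert and the sample rows are hypothetical and do not reflect how Hive represents the plan internally.

```python
def split_for_multi_insert(result_rows):
    """Route rows to the two branches based on the trailing deleted flag.

    Returns (inserts, deletes): live rows destined for the materialized view
    table, and deleted rows destined for its delete delta. The flag itself is
    dropped from both outputs.
    """
    inserts = [row[:-1] for row in result_rows if not row[-1]]
    deletes = [row[:-1] for row in result_rows if row[-1]]
    return inserts, deletes


# Made-up result rows: (key, left value, right value, is_deleted)
result_rows = [(2, "b", "x", False), (3, "c", "y", True)]
mv_inserts, mv_deletes = split_for_multi_insert(result_rows)
```

Both branches read the same result set once, which is the point of using a single multi insert statement rather than two separate queries.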
> Prerequisite:
> source tables haven't been compacted since the last MV rebuild
--
This message was sent by Atlassian Jira
(v8.3.4#803005)