[
https://issues.apache.org/jira/browse/HIVE-20382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579293#comment-16579293
]
Jesus Camacho Rodriguez commented on HIVE-20382:
------------------------------------------------
Requires CALCITE-2465.
Cc [~ashutoshc]
> Materialized views: Introduce heuristic to favour incremental rebuild
> ---------------------------------------------------------------------
>
> Key: HIVE-20382
> URL: https://issues.apache.org/jira/browse/HIVE-20382
> Project: Hive
> Issue Type: Improvement
> Components: Materialized views
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Major
> Attachments: HIVE-20382.patch
>
>
> Currently, we do not expose stats over ROW__ID.writeId to the optimizer
> (this should be fixed by HIVE-20313). Even if we did, we always assume a
> uniform distribution of the column values, which can easily lead to
> overestimating the number of rows read when we filter on
> ROW__ID.writeId for materialized views (think of one large transaction for
> MV creation followed by small ones for incremental maintenance). Because of
> this overestimation, incremental view maintenance may not be triggered:
> the cost of the incremental plan is overestimated (we think we will read more
> rows than we actually do). This could be fixed by introducing histograms that
> better reflect the distribution of the column values.
> Until both fixes are implemented, we will use a config variable that
> multiplies the estimated cost of the rebuild plan, which makes it possible to
> favour incremental rebuild over full rebuild.
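
To make the overestimation concrete, here is a rough sketch of the scenario
described above. It is not Hive's cost model: the row counts, the write IDs,
the merge overhead, the "cost = rows read" formula, and the factor values are
all made up for illustration, and the sketch assumes the factor scales the
cost of the incremental plan.

```python
# Illustrative only: every number and formula here is a hypothetical stand-in.

def uniform_estimate(total_rows, min_id, max_id, last_seen_id):
    """Estimate rows with writeId > last_seen_id, assuming rows are spread
    evenly over the write ID range [min_id, max_id]."""
    span = max_id - min_id
    if span <= 0:
        return float(total_rows)
    fraction = (max_id - last_seen_id) / span
    fraction = min(1.0, max(0.0, fraction))  # clamp to a valid selectivity
    return total_rows * fraction

# One large transaction (writeId 1) creates the MV contents, then nine small
# transactions (writeIds 2..10) add 100 rows each.
rows_per_write_id = {1: 1_000_000}
rows_per_write_id.update({w: 100 for w in range(2, 11)})
total_rows = sum(rows_per_write_id.values())

last_seen_id = 1  # the MV already reflects writeId 1
actual_rows = sum(n for w, n in rows_per_write_id.items() if w > last_seen_id)
estimated_rows = uniform_estimate(total_rows, 1, 10, last_seen_id)

# Toy costs: a full rebuild reads everything; an incremental rebuild reads the
# estimated new rows plus a fixed (hypothetical) overhead for merging.
MERGE_OVERHEAD = 50_000
full_cost = float(total_rows)
incremental_cost = estimated_rows + MERGE_OVERHEAD

def choose_plan(factor):
    """Pick the cheaper plan after scaling the incremental cost by factor."""
    return "incremental" if incremental_cost * factor < full_cost else "full"

if __name__ == "__main__":
    print("actual rows read:   ", actual_rows)
    print("estimated rows read:", estimated_rows)
    print("factor 1.0 ->", choose_plan(1.0))
    print("factor 0.1 ->", choose_plan(0.1))
```

In this toy setup the filter on writeId actually selects only the rows from
the small transactions, but the uniform assumption estimates that the whole
table is read, so the unscaled incremental plan looks more expensive than a
full rebuild. A factor below 1 tips the comparison back towards the
incremental plan.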