[ https://issues.apache.org/jira/browse/HIVE-20382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16580388#comment-16580388 ]

Ashutosh Chauhan commented on HIVE-20382:
-----------------------------------------

+1

> Materialized views: Introduce heuristic to favour incremental rebuild
> ---------------------------------------------------------------------
>
>                 Key: HIVE-20382
>                 URL: https://issues.apache.org/jira/browse/HIVE-20382
>             Project: Hive
>          Issue Type: Improvement
>          Components: Materialized views
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>         Attachments: HIVE-20382.patch, HIVE-20382.patch
>
>
> Currently, we do not expose stats over ROW__ID.writeId to the optimizer
> (this should be fixed by HIVE-20313). Even if we did, we always assume a
> uniform distribution of the column values, which can easily lead to
> overestimating the number of rows read when we filter on ROW__ID.writeId
> for materialized views (think of one large transaction for MV creation
> followed by small ones for incremental maintenance). This overestimation
> can prevent incremental view maintenance from being triggered, because the
> cost of the incremental plan is overestimated (we think we will read more
> rows than we actually do). This could be fixed by introducing histograms
> that better reflect the distribution of the column values.
> Until both fixes are implemented, we will use a config variable that
> multiplies the estimated cost of the rebuild plan, making it possible to
> favour incremental rebuild over full rebuild.
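
For illustration, the effect described above can be sketched with a toy
model. This is not Hive code: the row counts, the per-row costs, the factor
value of 0.1, and the function names uniform_estimate and choose_plan are all
hypothetical, chosen only to show how a uniform-distribution assumption
inflates the estimate and how a multiplicative factor on the incremental
plan cost could tip the decision.

```python
# Toy model of the overestimation; every number and name here is hypothetical.

def uniform_estimate(total_rows, min_id, max_id, threshold):
    """Rows estimated to satisfy writeId > threshold, assuming the values
    are uniformly distributed over [min_id, max_id] (requires max_id > min_id)."""
    fraction = (max_id - threshold) / (max_id - min_id)
    return total_rows * min(max(fraction, 0.0), 1.0)


def choose_plan(full_cost, incremental_cost, factor=1.0):
    """Pick the cheaper plan after scaling the incremental cost by factor."""
    return "incremental" if incremental_cost * factor < full_cost else "full"


# One large transaction creates the MV contents (writeId 1), followed by
# ten small maintenance transactions (writeIds 2..11) of 100 rows each.
rows_per_write_id = {1: 1_000_000}
rows_per_write_id.update({write_id: 100 for write_id in range(2, 12)})

total_rows = sum(rows_per_write_id.values())
threshold = 10  # the incremental rebuild only needs rows with writeId > 10

actual_rows = sum(n for w, n in rows_per_write_id.items() if w > threshold)
estimated_rows = uniform_estimate(total_rows, 1, 11, threshold)

# Assumed per-row costs: the incremental plan is taken to be more expensive
# per row because it has to merge new rows into the existing view.
full_cost = total_rows * 1.0
incremental_cost = estimated_rows * 12.0

print(actual_rows, estimated_rows)
print(choose_plan(full_cost, incremental_cost))
print(choose_plan(full_cost, incremental_cost, factor=0.1))
```

In this toy setup the filter really matches 100 rows, while the uniform
estimate is about 100,100, so the unscaled comparison picks the full
rebuild. Scaling the incremental cost by the assumed factor reverses that
choice, which is the kind of bias the heuristic is meant to introduce until
better statistics are available.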



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
