[
https://issues.apache.org/jira/browse/HIVE-27948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Krisztian Kasa reassigned HIVE-27948:
-------------------------------------
Assignee: Krisztian Kasa
> Wrong results when using materialized views with non-deterministic/dynamic
> functions
> ------------------------------------------------------------------------------------
>
> Key: HIVE-27948
> URL: https://issues.apache.org/jira/browse/HIVE-27948
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 4.0.0-beta-1
> Reporter: Stamatis Zampetakis
> Assignee: Krisztian Kasa
> Priority: Critical
> Attachments: materialized_view_unix_timestamp.q
>
>
> There are certain SQL functions that return different results across
> different executions. Usually we refer to these functions as
> non-deterministic or dynamic. Some examples are: UNIX_TIMESTAMP(),
> CURRENT_TIMESTAMP, CURRENT_DATE, etc.
> When a materialized view definition contains such functions the queries that
> are using this view may return wrong results.
> Consider the following scenario where we populate the employee table with
> timestamps representing the future. For making this easily reproable in
> self-contained test the timestamps are only a few seconds apart.
> {code:sql}
> CREATE TABLE EMPS (ENAME STRING, BIRTH_EPOCH_SECS INT) STORED AS ORC
> TBLPROPERTIES ('transactional'='true');
> INSERT INTO EMPS
> VALUES ('Victor', UNIX_TIMESTAMP()),
> ('Alex', UNIX_TIMESTAMP() + 2),
> ('Bob', UNIX_TIMESTAMP() + 5),
> ('Alice', UNIX_TIMESTAMP() + 10);
> CREATE MATERIALIZED VIEW v_emp AS SELECT * FROM EMPS WHERE BIRTH_EPOCH_SECS
> <= UNIX_TIMESTAMP();
> {code}
> When the materialized view is created it is populated with only the rows that
> match the timestamp at the given time.
> To demonstrate the problem run the following queries with view based
> rewritting disabled and enabled.
> {code:sql}
> set hive.materializedview.rewriting.sql=false;
> SELECT * FROM EMPS WHERE BIRTH_EPOCH_SECS <= UNIX_TIMESTAMP();
> {code}
> {noformat}
> Victor 1702302786
> Alex 1702302788
> Bob 1702302791
> {noformat}
> {code:sql}
> set hive.materializedview.rewriting.sql=true;
> SELECT * FROM EMPS WHERE BIRTH_EPOCH_SECS <= UNIX_TIMESTAMP();
> {code}
> {noformat}
> Victor 1702302786
> Alex 1702302788
> {noformat}
> Naturally the second query should return more rows than the first one since
> UNIX_TIMESTAMP is constantly growing. However, when view based rewritting is
> in use the second query will use the results from the materialized view which
> are by now obsolete (missing Bob entry).
--
This message was sent by Atlassian Jira
(v8.20.10#820010)