[
https://issues.apache.org/jira/browse/HIVE-27948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795421#comment-17795421
]
Stamatis Zampetakis commented on HIVE-27948:
--------------------------------------------
It seems that before HIVE-24274, it was not possible to create materialized
views with non-deterministic/dynamic functions. I am not sure why this
requirement was relaxed with the feature added by HIVE-24274. [~kkasa] any
ideas why we changed the SemanticException to a plain console/log message in
[SemanticAnalyzer|https://github.com/apache/hive/blob/0e9282b3ed7c59dee3e3a950bd3cd390cca2652b/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L14668]?
> Wrong results when using materialized views with non-deterministic/dynamic
> functions
> ------------------------------------------------------------------------------------
>
> Key: HIVE-27948
> URL: https://issues.apache.org/jira/browse/HIVE-27948
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 4.0.0-beta-1
> Reporter: Stamatis Zampetakis
> Priority: Major
>
> There are certain SQL functions that return different results across
> different executions. Usually we refer to these functions as
> non-deterministic or dynamic. Some examples are: UNIX_TIMESTAMP(),
> CURRENT_TIMESTAMP, CURRENT_DATE, etc.
> When a materialized view definition contains such functions the queries that
> are using this view may return wrong results.
> Consider the following scenario where we populate the employee table with
> timestamps representing the future. For making this easily reproable in
> self-contained test the timestamps are only a few seconds apart.
> {code:sql}
> CREATE TABLE EMPS (ENAME STRING, BIRTH_EPOCH_SECS INT) STORED AS ORC
> TBLPROPERTIES ('transactional'='true');
> INSERT INTO EMPS
> VALUES ('Victor', UNIX_TIMESTAMP()),
> ('Alex', UNIX_TIMESTAMP() + 2),
> ('Bob', UNIX_TIMESTAMP() + 5),
> ('Alice', UNIX_TIMESTAMP() + 10);
> CREATE MATERIALIZED VIEW v_emp AS SELECT * FROM EMPS WHERE BIRTH_EPOCH_SECS
> <= UNIX_TIMESTAMP();
> {code}
> When the materialized view is created it is populated with only the rows that
> match the timestamp at the given time.
> To demonstrate the problem run the following queries with view based
> rewritting disabled and enabled.
> {code:sql}
> set hive.materializedview.rewriting.sql=false;
> SELECT * FROM EMPS WHERE BIRTH_EPOCH_SECS <= UNIX_TIMESTAMP();
> {code}
> {noformat}
> Victor 1702302786
> Alex 1702302788
> Bob 1702302791
> {noformat}
> {code:sql}
> set hive.materializedview.rewriting.sql=true;
> SELECT * FROM EMPS WHERE BIRTH_EPOCH_SECS <= UNIX_TIMESTAMP();
> {code}
> {noformat}
> Victor 1702302786
> Alex 1702302788
> {noformat}
> Naturally the second query should return more rows than the first one since
> UNIX_TIMESTAMP is constantly growing. However, when view based rewritting is
> in use the second query will use the results from the materialized view which
> are by now obsolete (missing Bob entry).
--
This message was sent by Atlassian Jira
(v8.20.10#820010)