Zhihua Deng created HIVE-23893:
----------------------------------

             Summary: Extract deterministic conditions for ppd when the predicate contains non-deterministic function
                 Key: HIVE-23893
                 URL: https://issues.apache.org/jira/browse/HIVE-23893
             Project: Hive
          Issue Type: Improvement
          Components: Logical Optimizer
            Reporter: Zhihua Deng

Take the following query as an example, assuming unix_timestamp is non-deterministic (as in versions before 1.3.0):

{{SELECT}}
{{ from_unixtime(unix_timestamp(a.first_dt), 'yyyyMMdd') AS ft,}}
{{ b.game_id AS game_id,}}
{{ b.game_name AS game_name,}}
{{ count(DISTINCT a.sha1_imei) uv}}
{{FROM}}
{{ gamesdk_userprofile a}}
{{ JOIN game_info_all b ON a.appid = b.dev_app_id}}
{{WHERE}}
{{ a.date = 20200704}}
{{ AND from_unixtime(unix_timestamp(a.first_dt), 'yyyyMMdd') = 20200704}}
{{ AND b.date = 20200704}}
{{GROUP BY}}
{{ from_unixtime(unix_timestamp(a.first_dt), 'yyyyMMdd'),}}
{{ b.game_id,}}
{{ b.game_name}}
{{ORDER BY}}
{{ uv DESC}}
{{LIMIT 200;}}

Because the WHERE clause also contains a condition on the non-deterministic unix_timestamp, the deterministic predicates (a.date = 20200704, b.date = 20200704) cannot be pushed down to the join operator. As a result the optimizer cannot prune partitions, which may lead to a full scan of the tables gamesdk_userprofile and game_info_all.
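
The extraction proposed in the summary can be illustrated with a minimal sketch. This is not Hive's optimizer code: the expression classes (Col, Lit, Func), the NON_DETERMINISTIC set, and the function names are hypothetical, chosen only to show the idea of splitting a top-level AND into conjuncts and keeping the deterministic ones as pushdown candidates.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# Hypothetical registry: functions this sketch treats as non-deterministic.
NON_DETERMINISTIC = {"unix_timestamp", "rand"}


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: object


@dataclass(frozen=True)
class Func:
    name: str
    args: Tuple["Expr", ...]


Expr = Union[Col, Lit, Func]


def is_deterministic(expr: Expr) -> bool:
    """An expression is deterministic if no function in its tree is flagged."""
    if isinstance(expr, Func):
        if expr.name in NON_DETERMINISTIC:
            return False
        return all(is_deterministic(arg) for arg in expr.args)
    return True  # columns and literals are always deterministic


def conjuncts(expr: Expr) -> List[Expr]:
    """Flatten nested top-level ANDs into a list of conjuncts."""
    if isinstance(expr, Func) and expr.name == "and":
        result: List[Expr] = []
        for arg in expr.args:
            result.extend(conjuncts(arg))
        return result
    return [expr]


def split_predicate(pred: Expr) -> Tuple[List[Expr], List[Expr]]:
    """Return (pushdown candidates, conjuncts that must stay in place)."""
    pushable: List[Expr] = []
    residual: List[Expr] = []
    for c in conjuncts(pred):
        (pushable if is_deterministic(c) else residual).append(c)
    return pushable, residual


# The WHERE clause of the query above, written as an expression tree.
where = Func("and", (
    Func("=", (Col("a.date"), Lit(20200704))),
    Func("=", (
        Func("from_unixtime",
             (Func("unix_timestamp", (Col("a.first_dt"),)), Lit("yyyyMMdd"))),
        Lit(20200704),
    )),
    Func("=", (Col("b.date"), Lit(20200704))),
))

pushable, residual = split_predicate(where)
```

In this sketch, pushable holds the two partition filters on a.date and b.date, and residual holds the from_unixtime(unix_timestamp(...)) condition. The split is only safe for conjuncts of a top-level AND; a condition under an OR cannot be separated this way.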