[
https://issues.apache.org/jira/browse/HIVE-28705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17921721#comment-17921721
]
Indhumathi Muthumurugesh edited comment on HIVE-28705 at 1/28/25 12:41 PM:
---------------------------------------------------------------------------
Thanks [~okumin] for checking this. Here is my analysis.
Hive has supported *is not? true/false* predicates since HIVE-13583, which is
why the filter predicate looks like this:
(month = '2023-07') is not true
When CBO is enabled, *HivePartitionPruneRule* is applied. It improves
performance by pruning irrelevant partitions on the HMS side using the filter
conditions.
org.apache.hadoop.hive.metastore.PartFilterExprUtil converts most filter
expressions into partition filters. If an expression cannot be converted, the
resulting filter is null and all partitions are fetched from the backend DB. In
this particular case, the expression "(month = '2023-07') is not true" is
converted to "(month = '2023-07') is true" because *PartitionFilter.g4* has no
handling for *is not?* filters, which leads to incorrect results.
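To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It is not Hive or metastore code: the helper names (is_true, is_not_true, month_equals, prune) are hypothetical, and the partition list simply mirrors the single '2020-12' partition from the reproduction in the issue description. It only models SQL three-valued logic to show how dropping the NOT flips the pruning result, and how a null filter falls back to fetching every partition.

```python
from typing import Callable, List, Optional


def is_true(value: Optional[bool]) -> bool:
    """Model of SQL `value IS TRUE`: only a definite TRUE passes."""
    return value is True


def is_not_true(value: Optional[bool]) -> bool:
    """Model of SQL `value IS NOT TRUE`: FALSE and NULL both pass."""
    return value is not True


def month_equals(partition_month: Optional[str], literal: str) -> Optional[bool]:
    """Model of SQL `month = literal`; a NULL operand yields NULL (None)."""
    if partition_month is None:
        return None
    return partition_month == literal


def prune(partitions: List[str],
          predicate: Optional[Callable[[str], bool]]) -> List[str]:
    """Keep partitions that satisfy the predicate.

    A predicate of None stands for "expression could not be converted",
    in which case every partition is returned (hypothetical stand-in for
    the fetch-all fallback described in the comment).
    """
    if predicate is None:
        return list(partitions)
    return [p for p in partitions if predicate(p)]


# Single partition, as in the reproduction from the issue description.
partitions = ["2020-12"]

# Filter as written by the planner: (month = '2023-07') is not true
intended = prune(partitions, lambda m: is_not_true(month_equals(m, "2023-07")))

# Filter after the NOT is lost: (month = '2023-07') is true
misparsed = prune(partitions, lambda m: is_true(month_equals(m, "2023-07")))

# Filter that could not be converted at all: no pruning happens
fallback = prune(partitions, None)

print(intended)   # ['2020-12']
print(misparsed)  # []
print(fallback)   # ['2020-12']
```

The misparsed filter prunes the only matching partition, which corresponds to the empty result seen with CBO enabled. The fetch-all fallback is slower but still returns correct rows.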
> Data Inconsistency due to missing 'IS NOT? TRUE/FALSE' parsing in Partition
> Filter Pruning
> ------------------------------------------------------------------------------------------
>
> Key: HIVE-28705
> URL: https://issues.apache.org/jira/browse/HIVE-28705
> Project: Hive
> Issue Type: Bug
> Reporter: chiranjeevi
> Assignee: Indhumathi Muthumurugesh
> Priority: Major
> Labels: pull-request-available
>
> create table t1(id string) partitioned by (month string);
> insert into t1 select '1','2020-12';
> set hive.cbo.enable=false;
> select id from t1 where (case when month='2023-07' then false else true END);
> -- output: 1
> set hive.cbo.enable=true;
> select id from t1 where (case when month='2023-07' then false else true END);
> -- output: {no result}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)