[ 
https://issues.apache.org/jira/browse/HIVE-15782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15848830#comment-15848830
 ] 

Aihua Xu commented on HIVE-15782:
---------------------------------

patch-1: decimal, date and timestamp are currently not supported for filtering 
on parquet. Currently, if the filter expression contains a column of such a 
type, the subexpression on that column is incorrectly ignored. 

With the patch, if the search argument can't be converted into a filter 
expression, then no filtering is applied to the parquet files.
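
To illustrate why ignoring a subexpression is wrong under OR, here is a minimal 
Python sketch. It is a toy model and not Hive's or Parquet's actual code: the 
tuple-based expression format and the names convert_dropping, 
convert_all_or_nothing and scan are all hypothetical. It only shows that 
dropping one branch of an OR narrows the filter, while giving up on the whole 
filter keeps every row for the later predicate evaluation.

```python
# Toy model only: these names and data structures are hypothetical,
# not Hive or Parquet APIs.
UNSUPPORTED = {"decimal", "date", "timestamp"}
TYPES = {"name": "string", "dec": "decimal"}
ROWS = [{"name": "Jim", "dec": 3}, {"name": "Tom", "dec": 5}]

# Expressions are tuples: ("eq", column, value), ("or", a, b), ("and", a, b).


def convert_dropping(expr):
    """Old behaviour (sketch): silently drop leaves on unsupported types."""
    op = expr[0]
    if op == "eq":
        return None if TYPES[expr[1]] in UNSUPPORTED else expr
    kids = [convert_dropping(child) for child in expr[1:]]
    kids = [k for k in kids if k is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return (op, *kids)


def convert_all_or_nothing(expr):
    """Patched behaviour (sketch): any unsupported leaf means no filter at all."""
    op = expr[0]
    if op == "eq":
        return None if TYPES[expr[1]] in UNSUPPORTED else expr
    kids = [convert_all_or_nothing(child) for child in expr[1:]]
    if any(k is None for k in kids):
        return None
    return (op, *kids)


def matches(expr, row):
    """Evaluate an expression tree against a single row."""
    op = expr[0]
    if op == "eq":
        return row[expr[1]] == expr[2]
    results = [matches(child, row) for child in expr[1:]]
    return any(results) if op == "or" else all(results)


def scan(rows, pushed_filter):
    """Rows that survive the pushed-down filter; None means no filtering."""
    if pushed_filter is None:
        return list(rows)
    return [row for row in rows if matches(pushed_filter, row)]


# Same predicate as the query in the issue: name = 'Jim' or dec = 5
query = ("or", ("eq", "name", "Jim"), ("eq", "dec", 5))
rows_old = scan(ROWS, convert_dropping(query))        # filter shrinks to name = 'Jim'
rows_new = scan(ROWS, convert_all_or_nothing(query))  # no filter, all rows kept
```

In this model rows_old loses the Tom row before the full predicate is ever 
evaluated, while rows_new keeps both rows.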

> query on parquet table returns incorrect result when 
> hive.optimize.index.filter is set to true 
> -----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-15782
>                 URL: https://issues.apache.org/jira/browse/HIVE-15782
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 2.2.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>         Attachments: HIVE-15782.1.patch
>
>
> When hive.optimize.index.filter is set to true, the parquet table is filtered 
> using the parquet column index. 
> {noformat}
> set hive.optimize.index.filter=true;
> CREATE TABLE t1 (
>   name string,
>   dec decimal(5,0)
> ) stored as parquet;
> insert into table t1 values('Jim', 3);
> insert into table t1 values('Tom', 5);
> select * from t1 where (name = 'Jim' or dec = 5);
> {noformat}
> Only one row {{Jim, 3}} is returned, but both should be returned. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)