[ 
https://issues.apache.org/jira/browse/HIVE-19108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447531#comment-16447531
 ] 

Jerry Chen commented on HIVE-19108:
-----------------------------------

I debugged the case and found the root cause. The problem is caused by 
inconsistent handling of double and float in the Hive vectorization engine 
(it is not related to the Parquet vectorized reader). Take the following 
query for example:

create table newtypestbl(c char(10), v varchar(10), d decimal(5,3), da date) 
stored as parquet;

insert overwrite table newtypestbl select * from (select cast("apple" as 
char(10)), cast("bee" as varchar(10)), 0.22, cast("1970-02-20" as date) from 
src src1 union all select cast("hello" as char(10)), cast("world" as 
varchar(10)), 11.22, cast("1970-02-27" as date) from src src2 limit 10) 
uniontbl;

set hive.optimize.index.filter=false;
select * from newtypestbl where d=cast('0.22' as float);

At compile time, the literal "0.22" is converted to a float value, which 
yields "0.2199999988079071" because of float precision. The vectorization 
engine, however, uses double-based expressions and functions to evaluate 
everything else, including filtering (FilterDoubleColEqualDoubleScalar) and 
decimal-to-double conversion (CastDecimalToDouble), which keep the decimal 
"0.22" at double precision. When 0.22 (double) is compared with 
"0.2199999988079071" (float), the predicate fails, so the vectorized case 
returns an empty result set.
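
The mismatch can be reproduced outside Hive with plain Java arithmetic. The 
following is only a minimal sketch, not Hive code: the class and method names 
are hypothetical, and the methods merely imitate the two paths described above 
(narrowing the literal to float at compile time, and converting the decimal 
column value to double at run time).

```java
import java.math.BigDecimal;

public class FloatDoubleMismatch {

    // Hypothetical stand-in for compile-time folding of cast('0.22' as float):
    // the literal is rounded to float precision, then widened to double.
    static double foldedFloatScalar(String literal) {
        return (double) Float.parseFloat(literal);
    }

    // Hypothetical stand-in for a decimal-to-double conversion of the column
    // value: the decimal is converted straight to double precision.
    static double decimalToDouble(String decimal) {
        return new BigDecimal(decimal).doubleValue();
    }

    // Plain double equality, as a double-vs-double filter would apply it.
    static boolean doubleEquals(double columnValue, double scalar) {
        return columnValue == scalar;
    }

    public static void main(String[] args) {
        double scalar = foldedFloatScalar("0.22"); // float-precision value
        double column = decimalToDouble("0.22");   // double-precision value

        System.out.println("scalar = " + scalar);
        System.out.println("column = " + column);

        // The two values differ, so the equality predicate rejects the row.
        System.out.println(doubleEquals(column, scalar)); // false

        // Narrowing both sides to float makes them equal again.
        System.out.println((float) column == (float) scalar); // true
    }
}
```

The last comparison only shows that the two values agree once both sides use 
the same precision; it is not meant as the fix for this issue.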

> Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q 
> causes Wrong Query Results
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-19108
>                 URL: https://issues.apache.org/jira/browse/HIVE-19108
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.0.0
>            Reporter: Matt McCline
>            Priority: Critical
>
> Found in the vectorization-enabled-by-default experiment.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
