[ https://issues.apache.org/jira/browse/IMPALA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenzhe Zhou resolved IMPALA-8110.
---------------------------------
Resolution: Fixed
> Parquet stat filtering does not handle narrowed int types correctly
> -------------------------------------------------------------------
>
> Key: IMPALA-8110
> URL: https://issues.apache.org/jira/browse/IMPALA-8110
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Csaba Ringhofer
> Assignee: Wenzhe Zhou
> Priority: Critical
> Labels: correctness, parquet
>
> Impala can read int32 Parquet columns as tinyint/smallint SQL columns. If a
> value does not fit into the range of the 8/16-bit signed integer, it
> overflows, e.g. writing 128 as int32 and then reading it back as int8 returns
> -128. This is expected as far as I understand, but min/max stat filtering does
> not handle this case correctly:
> create table tnarrow (i int) stored as parquet;
> insert into tnarrow values (1), (201);
> alter table tnarrow change column i i tinyint;
> set PARQUET_READ_STATISTICS=0;
> select * from tnarrow where i < 0;
> -> returns 1 row: -55
> set PARQUET_READ_STATISTICS=1;
> select * from tnarrow where i < 0;
> -> returns 0 rows
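
The mismatch in the report can be reproduced outside Impala with a small Python sketch. It assumes the reader narrows int32 to int8 by keeping the low 8 bits (two's-complement truncation, so 201 becomes 201 - 256 = -55) and that the filter compares the predicate against the min/max of the values as originally written. The helper names (narrow_to_int8, naive_keep_row_group, stats_usable_for_int8) are hypothetical and are not Impala functions; the range guard at the end is only one possible mitigation, not necessarily the fix that was committed.

```python
def narrow_to_int8(value: int) -> int:
    """Keep the low 8 bits of value and reinterpret them as a signed int8."""
    low = value & 0xFF
    return low - 256 if low >= 128 else low


def naive_keep_row_group(stat_min: int, bound: int) -> bool:
    """Naive min/max check for `col < bound`: keep the row group only if
    the stored minimum is below the bound."""
    return stat_min < bound


def stats_usable_for_int8(stat_min: int, stat_max: int) -> bool:
    """Hypothetical guard: trust the stats only if both ends fit in int8."""
    return -128 <= stat_min and stat_max <= 127


# Values written as int32, and the stats computed from them at write time.
stored = [1, 201]
stat_min, stat_max = min(stored), max(stored)

# Values as seen after the column is changed to tinyint.
values_read = [narrow_to_int8(v) for v in stored]

# Predicate from the report: i < 0
rows_matching = [v for v in values_read if v < 0]
kept_by_naive_filter = naive_keep_row_group(stat_min, 0)
kept_with_guard = (not stats_usable_for_int8(stat_min, stat_max)
                   or naive_keep_row_group(stat_min, 0))

print(values_read)           # values after narrowing
print(rows_matching)         # rows that satisfy i < 0
print(kept_by_naive_filter)  # False: the row group is skipped
print(kept_with_guard)       # True: stats ignored, row group is scanned
```

With the guard in place, the out-of-range maximum (201) makes the stats untrustworthy for a tinyint column, so the row group is scanned and the wrapped value is returned regardless of the PARQUET_READ_STATISTICS setting.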