Csaba Ringhofer created IMPALA-13625:
----------------------------------------
Summary: Allow reading Parquet int32/int64 as Decimal without
Logical types
Key: IMPALA-13625
URL: https://issues.apache.org/jira/browse/IMPALA-13625
Project: IMPALA
Issue Type: Sub-task
Components: Backend
Reporter: Csaba Ringhofer
Currently Impala can only read int32/int64 as decimal from Parquet if the
scale/precision is provided by a logical type in the Parquet metadata.
For schema evolution it could be useful to read files that were written as
INT after the column has been altered to a suitable DECIMAL.
For example:
{code}
create table t (i int) stored as parquet;
insert into t values (1);
alter table t change i i decimal(10, 0);
select * from t;
{code}
Currently this returns an error:
"2024-12-18 18:41:52 [Exception] ERROR: Query ...failed:
File... column 'i' does not have the decimal precision set."
The error comes from here:
https://github.com/apache/impala/blob/aefd1b0920150feff31922a6979affc005a6a7d4/be/src/exec/parquet/parquet-metadata-utils.cc#L373
ParquetMetadataUtils::ValidateColumn could be more tolerant and consider an
int32 as decimal(9,0) or an int64 as decimal(19,0).
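One possible shape for such a tolerant check is sketched below. This is not Impala code: the PhysicalType enum and the MaxDigits and CanReadAsDecimal helpers are hypothetical names used only for illustration. The sketch assumes a conservative rule where the table's DECIMAL must have scale 0 and enough precision to hold every value of the physical type, which accepts the decimal(10, 0) column from the example above. Under this assumed rule int32 needs precision 10 or more, so it is stricter than the decimal(9,0) suggestion; a more lenient check could accept lower precision and validate values at read time.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the Parquet physical types discussed in this
// issue; this is not Impala's or parquet-format's actual enum.
enum class PhysicalType { kInt32, kInt64 };

// Number of decimal digits in the largest value of the type:
// 10 for int32 (2147483647) and 19 for int64 (9223372036854775807).
// digits10 is the digit count that always round-trips (9 and 18), so the
// widest value needs one more digit.
constexpr int MaxDigits(PhysicalType t) {
  return t == PhysicalType::kInt32
             ? std::numeric_limits<std::int32_t>::digits10 + 1
             : std::numeric_limits<std::int64_t>::digits10 + 1;
}

// Assumed rule (not confirmed by the issue): when the file carries no logical
// type, allow the read only if the table's DECIMAL has scale 0 and can hold
// every possible value of the physical integer type.
constexpr bool CanReadAsDecimal(PhysicalType t, int precision, int scale) {
  return scale == 0 && precision >= MaxDigits(t);
}
```

With these assumptions, CanReadAsDecimal(PhysicalType::kInt32, 10, 0) is true, matching the ALTER TABLE example, while a nonzero scale is rejected because the stored integers carry no fractional digits.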