Csaba Ringhofer created IMPALA-13625:
----------------------------------------

             Summary: Allow reading Parquet int32/int64 as Decimal without logical types
                 Key: IMPALA-13625
                 URL: https://issues.apache.org/jira/browse/IMPALA-13625
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Csaba Ringhofer


Currently Impala can only read int32/int64 as decimal from Parquet if the 
scale/precision is provided by a logical type in the Parquet metadata.
For schema evolution it could be useful to read files that were written as 
INT after the column has been altered to a suitable DECIMAL.

For example:
{code}
create table t (i int) stored as parquet;
insert into t values (1);
alter table t change i i decimal(10, 0);
select * from t;
{code}

Currently this returns an error:
"2024-12-18 18:41:52 [Exception]  ERROR: Query ...failed:
File...  column 'i' does not have the decimal precision set."

The error comes from here: 
https://github.com/apache/impala/blob/aefd1b0920150feff31922a6979affc005a6a7d4/be/src/exec/parquet/parquet-metadata-utils.cc#L373

ParquetMetadataUtils::ValidateColumn could be more tolerant and, when no 
logical type is present, treat an int32 as decimal(9,0) and an int64 as 
decimal(19,0).
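
A minimal sketch of what such a fallback could look like, not the actual 
Impala code. PhysicalType, DecimalType, AssumedDecimalType and 
CanReadAsDecimal are hypothetical names made up for illustration. The 
fallback precisions 9 and 19 are the ones suggested above. The compatibility 
rule (the table column needs at least as many integer digits as the assumed 
file type) is an assumption, not something ValidateColumn is known to do.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not Impala's real types.
enum class PhysicalType { INT32, INT64, OTHER };

struct DecimalType {
  int precision;
  int scale;
};

// Fills 'out' with the decimal type assumed for an integer column that has
// no logical type. The precisions 9 and 19 are the values proposed in this
// issue. Returns false for physical types that have no fallback.
bool AssumedDecimalType(PhysicalType type, DecimalType* out) {
  switch (type) {
    case PhysicalType::INT32: *out = DecimalType{9, 0}; return true;
    case PhysicalType::INT64: *out = DecimalType{19, 0}; return true;
    default: return false;
  }
}

// Assumed compatibility rule: the table column must offer at least as many
// integer digits (precision - scale) as the assumed file type.
bool CanReadAsDecimal(PhysicalType file_type, const DecimalType& table_type) {
  DecimalType assumed;
  if (!AssumedDecimalType(file_type, &assumed)) return false;
  return table_type.precision - table_type.scale >=
         assumed.precision - assumed.scale;
}

// Mirrors the example from this issue: an INT column altered to decimal(10, 0).
void ExampleFromIssue() {
  assert(CanReadAsDecimal(PhysicalType::INT32, DecimalType{10, 0}));
}
```

Under these assumptions the query from the example above would pass 
validation, while an int64 column altered to decimal(10, 0) would still be 
rejected.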


