[ 
https://issues.apache.org/jira/browse/IMPALA-13625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18008801#comment-18008801
 ] 

ASF subversion and git services commented on IMPALA-13625:
----------------------------------------------------------

Commit ee69ed1d0386f41689269a522e2aed490e52987d in impala's branch 
refs/heads/master from Daniel Vanko
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ee69ed1d0 ]

IMPALA-13625: Allow reading Parquet int32/int64 as decimal without logical types

This patch allows reading columns with integer logical type as decimals.
This can occur when we're trying to read files that were written as INT but
the column was altered to a suitable DECIMAL. In this case the precision
is derived from the physical type: 9 for int32 and 18 for int64.
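
One way to see where the values 9 and 18 come from: they are the largest
decimal precisions for which every value still fits in a signed 32-bit or
64-bit integer. The sketch below only checks that arithmetic. The function
name max_decimal_precision is hypothetical and is not taken from the Impala
code base.

```python
def max_decimal_precision(bits: int) -> int:
    """Return the largest p such that every p-digit decimal value
    fits in a signed integer of the given bit width."""
    max_signed = 2 ** (bits - 1) - 1
    precision = 0
    # Grow p while the largest (p + 1)-digit value, 10**(p + 1) - 1, still fits.
    while 10 ** (precision + 1) - 1 <= max_signed:
        precision += 1
    return precision


INT32_PRECISION = max_decimal_precision(32)  # int32 physical type
INT64_PRECISION = max_decimal_precision(64)  # int64 physical type
```

A 19-digit precision does not qualify for int64 because 10**19 - 1 is larger
than 2**63 - 1, even though some 19-digit values can be stored in an int64.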

Test:
* add new e2e tests

Change-Id: I56006eb3cca28c81ec8467d77b35005fbf669680
Reviewed-on: http://gerrit.cloudera.org:8080/22922
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Allow reading Parquet int32/int64 as Decimal without Logical types
> ------------------------------------------------------------------
>
>                 Key: IMPALA-13625
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13625
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Csaba Ringhofer
>            Assignee: Dániel Gábor Vankó
>            Priority: Major
>              Labels: ramp-up
>
> Currently Impala can only read int32/int64 as decimal from Parquet if the
> scale/precision is provided by logical type in the Parquet metadata.
> For schema evolution it could be useful to read files that were written as
> INT but the column was altered to a suitable DECIMAL.
> For example:
> {code}
> create table t (i int) stored as parquet;
> insert into t values (1);
> alter table t change i i decimal(10, 0);
> select * from t;
> {code}
> Currently this returns an error: "2024-12-18 18:41:52 [Exception]  ERROR: Query
> ...failed:
> File...  column 'i' does not have the decimal precision set."
> The error comes from here: 
> https://github.com/apache/impala/blob/aefd1b0920150feff31922a6979affc005a6a7d4/be/src/exec/parquet/parquet-metadata-utils.cc#L373
> ParquetMetadataUtils::ValidateColumn could be more tolerant and consider an
> int32 as decimal(9,0) or an int64 as decimal(18,0).
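
The tolerant behaviour requested above could look roughly like the following
sketch. It is not Impala's ValidateColumn implementation. The names
FALLBACK_PRECISION and resolve_file_precision are hypothetical, and the
fallback values are the 9 and 18 stated in the commit message.

```python
from typing import Optional

# Assumed fallback precisions per Parquet physical type (from the commit message).
FALLBACK_PRECISION = {"INT32": 9, "INT64": 18}


def resolve_file_precision(physical_type: str,
                           declared_precision: Optional[int]) -> int:
    """Pick the decimal precision to assume for a Parquet column.

    Prefer the precision declared in the file metadata. If it is missing,
    fall back to a value derived from the integer physical type.
    """
    if declared_precision is not None:
        return declared_precision
    if physical_type in FALLBACK_PRECISION:
        return FALLBACK_PRECISION[physical_type]
    # Other physical types carry no implied precision, so this stays an error.
    raise ValueError(
        f"column of type {physical_type} does not have the decimal precision set")


# An INT32 column written without decimal metadata is treated as precision 9.
precision_for_plain_int = resolve_file_precision("INT32", None)
```

In this sketch, the column from the example, written as INT and later altered
to decimal(10, 0), would resolve to a file precision of 9 instead of failing
with a missing-precision error.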



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
