Dániel Gábor Vankó created IMPALA-14290:
-------------------------------------------

             Summary: Iceberg partitioning should be case insensitive of column 
names
                 Key: IMPALA-14290
                 URL: https://issues.apache.org/jira/browse/IMPALA-14290
             Project: IMPALA
          Issue Type: Bug
            Reporter: Dániel Gábor Vankó


When creating or altering the partition spec of an Iceberg table, Impala 
accepts column names only if they are written in lowercase.

For example, START_TIME in the PARTITIONED BY SPEC clause throws an exception. 
(The same happens with the YEAR, MONTH, TRUNCATE and BUCKET transforms.)
{noformat}
[localhost:21050] default> CREATE TABLE tab ( START_TIME TIMESTAMP )
PARTITIONED BY SPEC (DAY(START_TIME))
STORED AS ICEBERG;

Query: CREATE TABLE tab ( START_TIME TIMESTAMP )
PARTITIONED BY SPEC (DAY(START_TIME))
STORED AS ICEBERG
2025-08-06 12:50:04 [Exception]  ERROR: Query b7450d257090b184:8cb6c76000000000 
failed:
ImpalaRuntimeException: Error making 'createTable' RPC to Hive Metastore:
CAUSED BY: IllegalArgumentException: Cannot find source column: START_TIME
{noformat}
The table is created successfully if start_time is written in lowercase in the 
PARTITIONED BY SPEC clause, even though it is still uppercase in the column 
definition list:
{noformat}
[localhost:21050] default> CREATE TABLE tab2 ( START_TIME TIMESTAMP )
PARTITIONED BY SPEC (DAY(start_time))
STORED AS ICEBERG;

Query: CREATE TABLE tab2 ( START_TIME TIMESTAMP )
PARTITIONED BY SPEC (DAY(start_time))
STORED AS ICEBERG
+-------------------------+
| summary                 |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 0.21s{noformat}
Moreover, altering the partition spec of an existing table also throws an 
exception when the column name is not lowercase:
{noformat}
[localhost:21050] default> alter table tab2 set partition spec 
(month(START_TIME));
Query: alter table tab2 set partition spec (month(START_TIME))
2025-08-06 13:22:58 [Exception]  ERROR: Query bd439760e55e7cd2:9211459000000000 
failed:
ImpalaRuntimeException: Failed to ALTER table 'tab2': Cannot find field 
'START_TIME' in struct: struct<1: start_time: optional timestamp>{noformat}
 

During parsing, the column name is converted to lowercase: 
[https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java#L109]

In contrast, the partition field name is processed as-is, so it keeps the case 
the user typed: 
https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java#L52-L61
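
The mismatch described above can be illustrated with a minimal, self-contained sketch. This is not Impala code: the class PartitionFieldCaseSketch and the methods schemaOf, normalizeFieldName and resolveSourceColumn are hypothetical names invented for illustration, and the schema is modelled as a plain map. It only assumes what the report states, namely that column names are lowercased during parsing while partition field names are not. Lowercasing the partition field name before the lookup is one possible direction for a fix, not necessarily the one the project will choose.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class PartitionFieldCaseSketch {
    // Hypothetical stand-in for a table schema. Column names are stored
    // lowercased, mirroring what the report says happens during parsing.
    static Map<String, String> schemaOf(String... namesAndTypes) {
        Map<String, String> schema = new LinkedHashMap<>();
        for (int i = 0; i + 1 < namesAndTypes.length; i += 2) {
            schema.put(namesAndTypes[i].toLowerCase(Locale.ROOT), namesAndTypes[i + 1]);
        }
        return schema;
    }

    // Assumed fix: apply the same normalization to partition field names.
    static String normalizeFieldName(String fieldName) {
        return fieldName.toLowerCase(Locale.ROOT);
    }

    // Looks up the source column of a partition field. With normalize == false
    // the name is used as-is, which reproduces the failure from the report.
    static String resolveSourceColumn(Map<String, String> schema, String fieldName,
                                      boolean normalize) {
        String key = normalize ? normalizeFieldName(fieldName) : fieldName;
        if (!schema.containsKey(key)) {
            throw new IllegalArgumentException("Cannot find source column: " + fieldName);
        }
        return key;
    }

    public static void main(String[] args) {
        Map<String, String> schema = schemaOf("START_TIME", "TIMESTAMP");

        // As-is lookup: the schema holds "start_time", so "START_TIME" is not found.
        try {
            resolveSourceColumn(schema, "START_TIME", false);
        } catch (IllegalArgumentException e) {
            System.out.println("as-is lookup failed: " + e.getMessage());
        }

        // Normalized lookup: both sides are lowercase, so the column resolves.
        System.out.println("normalized lookup: "
            + resolveSourceColumn(schema, "START_TIME", true));
    }
}
```

Locale.ROOT is used so that lowercasing does not depend on the JVM's default locale.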



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
