[jira] [Comment Edited] (IMPALA-11053) Impala should be able to read migrated partitioned Iceberg tables

LiPenglin (Jira) Thu, 09 Jun 2022 06:44:11 -0700


    [ 
https://issues.apache.org/jira/browse/IMPALA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17552218#comment-17552218
 ]


LiPenglin edited comment on IMPALA-11053 at 6/9/22 1:43 PM:
------------------------------------------------------------

Thanks [~boroknagyz]. I cleaned up my code and got the same results as you; 
sorry for the mistake above.
{code:java}
[localhost.localdomain:21050] default> select * from functional_parquet.iceberg_alltypes_part where p_bool=true;
ERROR: Unable to find SchemaNode for path 'functional_parquet.iceberg_alltypes_part.p_bool' in the schema of file
'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala/000000_0'.
 {code}
I recently migrated a Hive table to an Iceberg table, and I expected the 
original Hive partition columns to remain usable in the WHERE clause after 
the migration. Is there a solution for this error on partition column 
values?

UPDATE: I saw IMPALA-11346, which is great!

 



> Impala should be able to read migrated partitioned Iceberg tables
> -----------------------------------------------------------------
>
>                 Key: IMPALA-11053
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11053
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>             Fix For: Impala 4.1.0
>
>
> When Hive (and probably other engines as well) converts a legacy Hive table 
> to Iceberg, it doesn't rewrite the data files.
> This means that the data files have neither write ids nor the partition 
> columns.
> Currently Impala expects the partition columns to be present in the data 
> files, so it can't read converted partitioned tables.
> So we need to inject partition values from the Iceberg metadata and resolve 
> columns correctly (position-based resolution needs an offset).
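
To illustrate the idea in the description, here is a minimal Python sketch of
injecting partition values and applying an offset during position-based
resolution. It is not Impala's implementation. The function name read_row, the
column names "id" and "name", and the sample values are assumptions made up
for this example; only p_bool and p_int come from the table above.

```python
from typing import Any, Dict, List, Sequence


def read_row(
    table_columns: Sequence[str],
    partition_values: Dict[str, Any],
    file_row: Sequence[Any],
) -> List[Any]:
    """Build a full table row from a data-file row that lacks partition columns.

    table_columns:    column names in table-schema order (hypothetical layout)
    partition_values: per-file constants, as they would come from metadata
    file_row:         values stored in the data file, non-partition columns only
    """
    row: List[Any] = []
    skipped = 0  # partition columns seen so far; they are absent from the file
    for table_pos, name in enumerate(table_columns):
        if name in partition_values:
            # Inject the constant instead of looking the column up in the file.
            row.append(partition_values[name])
            skipped += 1
        else:
            # Position-based resolution: shift the table position by the
            # number of partition columns that precede this column.
            row.append(file_row[table_pos - skipped])
    return row


# Hypothetical example: two partition columns followed by two data columns.
full_row = read_row(
    table_columns=["p_bool", "p_int", "id", "name"],
    partition_values={"p_bool": True, "p_int": 1},
    file_row=[42, "impala"],
)
print(full_row)
```

Without the offset, the lookup for "id" would use table position 2 and run
past the end of the two-column file row, which is roughly the kind of mismatch
the description refers to.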



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

