[ 
https://issues.apache.org/jira/browse/IMPALA-12987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-12987:
-------------------------------------
    Description: 
Inserting strings that contain "\0" characters into partition columns leads to 
errors in both Iceberg and Hive tables. 

The issue is more severe in Iceberg tables because, after the insert, the table 
can no longer be read in Impala or Hive:
{code}
create table iceberg_unicode (s string, p string) partitioned by spec 
(identity(p)) stored as iceberg;
insert into iceberg_unicode select "a", "a\0a";
ERROR: IcebergTableLoadingException: Error loading metadata for Iceberg table 
hdfs://localhost:20500/test-warehouse/iceberg_unicode
CAUSED BY: TableLoadingException: Refreshing file and block metadata for 1 
paths for table default.iceberg_unicode: failed to load 1 paths. Check the 
catalog server log for more details.
{code}

The partition directory created by the insert above seems to be truncated 
(p=a instead of the full value):
hdfs://localhost:20500/test-warehouse/iceberg_unicode/data/p=a
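
One possible explanation for the truncated name, stated here only as an 
assumption and not confirmed by this report, is that some component on the 
write path treats the value as a NUL-terminated C string. The Python sketch 
below uses ctypes to show that effect; partition_dir is a hypothetical name 
chosen for illustration and is not taken from the Impala code.

```python
import ctypes

# Hypothetical partition directory name built from the inserted value "a\0a".
partition_dir = b"p=a\0a"

# create_string_buffer copies all the bytes and appends a terminating NUL.
buf = ctypes.create_string_buffer(partition_dir)

# .raw exposes the whole buffer, including the embedded and trailing NUL bytes.
full_bytes = buf.raw

# .value reads the buffer the way C string functions do: up to the first NUL.
as_c_string = buf.value

print(full_bytes)
print(as_c_string)
```

The C-style view ends at "p=a", which matches the directory name seen above.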

In partitioned Hive tables the insert also returns an error, but the new 
partition is not created and the table remains usable. The error is similar to 
the one in IMPALA-11499.

Note that Java handles \0 characters in Unicode strings in a special way, which 
may be related: 
https://docs.oracle.com/javase/1.5.0/docs/guide/jni/spec/types.html#wp16542
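
For context, the JNI specification describes a modified UTF-8 encoding in 
which the null character is written as the two bytes 0xC0 0x80 instead of a 
single zero byte. The Python sketch below only illustrates that one 
difference. The helper name to_modified_utf8_nul_only is made up for this 
example, and the sketch deliberately ignores the other difference (the 
encoding of supplementary characters). Whether this encoding is involved in 
the bug is not established here.

```python
def to_modified_utf8_nul_only(text: str) -> bytes:
    """Encode text as UTF-8, but write U+0000 as 0xC0 0x80.

    Simplification: modified UTF-8 also encodes characters above U+FFFF
    differently from standard UTF-8; that case is NOT handled here.
    """
    # In standard UTF-8 the byte 0x00 can only come from U+0000,
    # so a plain byte replacement is safe.
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


value = "a\0a"
standard = value.encode("utf-8")            # contains a raw zero byte
modified = to_modified_utf8_nul_only(value)  # contains no zero byte

print(standard)
print(modified)
```

The two encodings differ in length and content, so code that mixes them up 
can disagree about what the partition value actually is.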


  was:
Inserting strings with "\0" values to partition columns leads errors both in 
Iceberg and Hive tables. 

The issue is more severe in Iceberg tables as from this point the table can't 
be read in Impala or Hive:
{code}
create table iceberg_unicode (s string, p string) partitioned by spec 
(identity(p)) stored as iceberg;
insert into iceberg_unicode select "a", "a\0a";
ERROR: IcebergTableLoadingException: Error loading metadata for Iceberg table 
hdfs://localhost:20500/test-warehouse/iceberg_unicode
CAUSED BY: TableLoadingException: Refreshing file and block metadata for 1 
paths for table default.iceberg_unicode: failed to load 1 paths. Check the 
catalog server log for more details.
{code}

In partition Hive tables the insert also returns an error, but the new 
partition is not created and the table remains usable. The error is similar to 
IMPALA-11499's



> Errors with \0 character in partition values
> --------------------------------------------
>
>                 Key: IMPALA-12987
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12987
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Csaba Ringhofer
>            Priority: Critical
>              Labels: iceberg



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
