Quanlong Huang created IMPALA-12889:
---------------------------------------
Summary: Changing file format to AVRO doesn't update schema using
'avro.schema.url'
Key: IMPALA-12889
URL: https://issues.apache.org/jira/browse/IMPALA-12889
Project: IMPALA
Issue Type: Bug
Components: Catalog
Reporter: Quanlong Huang
When a table's file format is changed to AVRO, its column schema is not
reloaded from the Avro schema, even if the table already has the
'avro.schema.url' tblproperty set. The schema is only updated after a
subsequent REFRESH:
{code:sql}
create table my_part_tbl(i int) partitioned by (p int) stored as parquet;
alter table my_part_tbl set tblproperties(
'avro.schema.url'='hdfs:////test-warehouse/avro_schemas/functional/alltypes.json');
alter table my_part_tbl set fileformat avro;
describe my_part_tbl;
+------+------+---------+
| name | type | comment |
+------+------+---------+
| i    | int  |         |
| p    | int  |         |
+------+------+---------+
refresh my_part_tbl;
describe my_part_tbl;
+-----------------+---------+-------------------+
| name            | type    | comment           |
+-----------------+---------+-------------------+
| id              | int     | from deserializer |
| bool_col        | boolean | from deserializer |
| tinyint_col     | int     | from deserializer |
| smallint_col    | int     | from deserializer |
| int_col         | int     | from deserializer |
| bigint_col      | bigint  | from deserializer |
| float_col       | float   | from deserializer |
| double_col      | double  | from deserializer |
| date_string_col | string  | from deserializer |
| string_col      | string  | from deserializer |
| timestamp_col   | string  | from deserializer |
| p               | int     |                   |
+-----------------+---------+-------------------+
{code}
Note that explicitly setting the tblproperty after changing the file format to
AVRO does update the schema. In other words, changing the file format before
setting 'avro.schema.url' works, but setting 'avro.schema.url' before changing
the file format does not.
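For comparison, the working order can be sketched as follows. This is only an
illustration of the sequence described in the note, not a verified test case:
it reuses the table definition and schema URL from the reproduction, and
my_part_tbl2 is a hypothetical table name chosen to avoid clashing with the
table created earlier.
{code:sql}
-- Hypothetical table name; same definition as in the reproduction.
create table my_part_tbl2(i int) partitioned by (p int) stored as parquet;
-- Change the file format first ...
alter table my_part_tbl2 set fileformat avro;
-- ... then set the schema URL. Per the note, this order updates the schema.
alter table my_part_tbl2 set tblproperties(
'avro.schema.url'='hdfs:////test-warehouse/avro_schemas/functional/alltypes.json');
-- Should list the Avro columns without a prior REFRESH.
describe my_part_tbl2;
{code}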