[ https://issues.apache.org/jira/browse/IMPALA-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Quanlong Huang updated IMPALA-2229:
-----------------------------------
    Epic Link: IMPALA-12887

> Inconsistent behavior between Impala and Hive when creating an Avro table 
> with an Avro schema in SERDEPROPERTIES and TBLPROPERTIES.
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-2229
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2229
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 1.3, Impala 1.4, Impala 2.0, Impala 2.1, Impala 2.2
>            Reporter: Alexander Behm
>            Priority: Minor
>              Labels: incompatibility
>
> It looks like Impala and Hive search the possible locations for an Avro 
> schema in different orders. Compare the behavior of Impala and Hive 
> for the following CREATE TABLE statement:
> {code}
> CREATE TABLE t
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> WITH SERDEPROPERTIES
> ('avro.schema.literal'='{"name": "my_record", "type": "record",
>  "fields": [{"name": "serde_string", "type": "string"}]}')
> TBLPROPERTIES
> ('avro.schema.literal'='{"name": "my_record", "type": "record",
>  "fields": [{"name": "tblprop_string", "type": "string"}]}');
> {code}
> Run the CREATE TABLE and DESC in Hive:
> {code}
> hive> CREATE TABLE t
>     > ROW FORMAT SERDE
>     > 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>     > WITH SERDEPROPERTIES
>     > ('avro.schema.literal'='{"name": "my_record", "type": "record",
>     >  "fields": [{"name": "serde_string", "type": "string"}]}')
>     > TBLPROPERTIES
>     > ('avro.schema.literal'='{"name": "my_record", "type": "record",
>     >  "fields": [{"name": "tblprop_string", "type": "string"}]}');
> OK
> Time taken: 0.689 seconds
> hive> desc t;
> OK
> tblprop_string        string                  from deserializer   
> Time taken: 0.224 seconds, Fetched: 1 row(s)
> hive> 
> {code}
> Run the CREATE TABLE and DESC in Impala. Note that Impala's syntax is 
> slightly different.
> {code}
> [localhost:21000] > CREATE TABLE t
>                   > WITH SERDEPROPERTIES
>                   > ('avro.schema.literal'='{"name": "my_record", "type": "record",
>                   > "fields": [{"name": "serde_string", "type": "string"}]}')
>                   > STORED AS AVRO
>                   > TBLPROPERTIES
>                   > ('avro.schema.literal'='{"name": "my_record", "type": "record",
>                   > "fields": [{"name": "tblprop_string", "type": "string"}]}');
> Query: create TABLE t
> WITH SERDEPROPERTIES
> ('avro.schema.literal'='{"name": "my_record", "type": "record",
> "fields": [{"name": "serde_string", "type": "string"}]}')
> STORED AS AVRO
> TBLPROPERTIES
> ('avro.schema.literal'='{"name": "my_record", "type": "record",
> "fields": [{"name": "tblprop_string", "type": "string"}]}')
> WARNINGS: Ignoring column definitions in favor of Avro schema.
> The Avro schema has 1 column(s) but 0 column definition(s) were given.
> Fetched 0 row(s) in 0.32s
> [localhost:21000] > desc t;
> Query: describe t
> +--------------+--------+-------------------+
> | name         | type   | comment           |
> +--------------+--------+-------------------+
> | serde_string | string | from deserializer |
> +--------------+--------+-------------------+
> Fetched 1 row(s) in 4.83s
> {code}
> The relevant code snippets from Impala can be found in CreateTableStmt.java 
> and HdfsTable.java:
> {code}
> // Look for the schema in TBLPROPERTIES and in SERDEPROPERTIES, with the latter
> // taking precedence.
> List<Map<String, String>> schemaSearchLocations = Lists.newArrayList();
> schemaSearchLocations.add(
>     getMetaStoreTable().getSd().getSerdeInfo().getParameters());
> schemaSearchLocations.add(getMetaStoreTable().getParameters());
> {code}
> We should make Impala behave consistently with Hive. However, this is an 
> incompatible change, so we will need to schedule the fix accordingly.
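
The effect of the search order can be illustrated with a minimal, self-contained sketch. It assumes a first-match-wins lookup over an ordered list of property maps, which is consistent with the comment in the quoted snippet but is not taken from the Impala or Hive sources. The class name `AvroSchemaLookup`, the method `findFirst`, and the placeholder schema strings are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the actual Impala or Hive code.
public class AvroSchemaLookup {
    static final String SCHEMA_KEY = "avro.schema.literal";

    // Returns the value from the first map (in search order) that contains
    // the key, or null if no map contains it.
    static String findFirst(List<Map<String, String>> searchLocations, String key) {
        for (Map<String, String> props : searchLocations) {
            String value = props.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Placeholder values standing in for the two conflicting schemas.
        Map<String, String> serdeProps = new HashMap<>();
        serdeProps.put(SCHEMA_KEY, "schema with serde_string");
        Map<String, String> tblProps = new HashMap<>();
        tblProps.put(SCHEMA_KEY, "schema with tblprop_string");

        // SERDEPROPERTIES searched first: matches the DESC output seen in Impala.
        System.out.println(findFirst(Arrays.asList(serdeProps, tblProps), SCHEMA_KEY));

        // TBLPROPERTIES searched first: matches the DESC output seen in Hive.
        System.out.println(findFirst(Arrays.asList(tblProps, serdeProps), SCHEMA_KEY));
    }
}
```

Under this assumption, swapping the order in which the two maps are added to the search list would be enough to change which schema is picked up when both locations define `avro.schema.literal`.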


