[ 
https://issues.apache.org/jira/browse/HIVE-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450191#comment-13450191
 ] 

Zhenxiao Luo commented on HIVE-3442:
------------------------------------

CC'd Jakob. So that if there is any AvroSerDe usage error, Jakob's comments and 
suggesions are always welcome.
                
> AvroSerDe WITH SERDEPROPERTIES 'schema.url' is not working when creating 
> external table
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-3442
>                 URL: https://issues.apache.org/jira/browse/HIVE-3442
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Zhenxiao Luo
>            Assignee: Zhenxiao Luo
>             Fix For: 0.10.0
>
>
> After creating a table and load data into it, I could check that the table is 
> created successfully, and data is inside:
> DROP TABLE IF EXISTS ml_items;
> CREATE TABLE ml_items(id INT,
>                       title STRING,
>                       release_date STRING,
>                       video_release_date STRING,
>                       imdb_url STRING,
>                       unknown_genre TINYINT,
>                       action TINYINT,
>                       adventure TINYINT,
>                       animation TINYINT,
>                       children TINYINT,
>                       comedy TINYINT,
>                       crime TINYINT,
>                       documentary TINYINT,
>                       drama TINYINT,
>                       fantasy TINYINT,
>                       film_noir TINYINT,
>                       horror TINYINT,
>                       musical TINYINT,
>                       mystery TINYINT,
>                       romance TINYINT,
>                       sci_fi TINYINT,
>                       thriller TINYINT,
>                       war TINYINT,
>                       western TINYINT)
>   ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n'
>   STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '../data/files/avro_items' INTO TABLE ml_items;
> select * from ml_items ORDER BY id ASC;
> While, the following create external table with AvroSerDe is not working:
> DROP TABLE IF EXISTS ml_items_as_avro;
> CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
>     'schema.url'='${system:test.src.data.dir}/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:${system:test.tmp.dir}/hive-ml-items';
> describe ml_items_as_avro;
> INSERT OVERWRITE TABLE ml_items_as_avro
>   SELECT id, title,
>     imdb_url, unknown_genre, action, adventure, animation, children, comedy, 
> crime,
>     documentary, drama, fantasy, film_noir, horror, musical, mystery, romance,
>     sci_fi, thriller, war, western
>   FROM ml_items;
> ml_items_as_avro is not created with expected schema, as shown in the 
> "describe ml_items_as_avro" output. The output is below:
> PREHOOK: query: DROP TABLE IF EXISTS ml_items_as_avro
> PREHOOK: type: DROPTABLE
> POSTHOOK: query: DROP TABLE IF EXISTS ml_items_as_avro
> POSTHOOK: type: DROPTABLE
> PREHOOK: query: CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
>     'schema.url'='/home/cloudera/Code/hive/data/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:/home/cloudera/Code/hive/build/ql/tmp/hive-ml-items'
> PREHOOK: type: CREATETABLE
> POSTHOOK: query: CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
>     'schema.url'='/home/cloudera/Code/hive/data/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:/home/cloudera/Code/hive/build/ql/tmp/hive-ml-items'
> POSTHOOK: type: CREATETABLE
> POSTHOOK: Output: default@ml_items_as_avro
> PREHOOK: query: describe ml_items_as_avro
> PREHOOK: type: DESCTABLE
> POSTHOOK: query: describe ml_items_as_avro
> POSTHOOK: type: DESCTABLE
> error_error_error_error_error_error_error   string  from deserializer
> cannot_determine_schema string  from deserializer
> check   string  from deserializer
> schema  string  from deserializer
> url string  from deserializer
> and string  from deserializer
> literal string  from deserializer
> FAILED: SemanticException [Error 10044]: Line 3:23 Cannot insert into target 
> table because column number/types are different 'ml_items_as_avro': Table 
> insclause-0 has 7 columns, but query has 22 columns.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to