[ https://issues.apache.org/jira/browse/HIVE-16370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
rui miranda updated HIVE-16370:
-------------------------------
Summary: avro data type null not supported on partitioned tables (was:
avro data type null not fully supported if table is partitioned)
> avro data type null not supported on partitioned tables
> -------------------------------------------------------
>
> Key: HIVE-16370
> URL: https://issues.apache.org/jira/browse/HIVE-16370
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.1.0, 2.1.1
> Reporter: rui miranda
> Priority: Minor
>
> I was attempting to create Hive tables over some partitioned Avro files. It
> seems the void data type (Avro null) is not supported on partitioned tables
> (I could not replicate the bug on an un-partitioned table).
> ---------------
> I managed to replicate the bug on two different Hive versions:
> Hive 1.1.0-cdh5.10.0
> Hive 2.1.1-amzn-0
> ----------------
> How to replicate (avro-tools is required to create the Avro files):
> $ wget http://mirror.serversupportforum.de/apache/avro/avro-1.8.1/java/avro-tools-1.8.1.jar
> $ mkdir /tmp/avro
> $ mkdir /tmp/avro/null
> $ echo "{ \
> \"type\" : \"record\", \
> \"name\" : \"null_failure\", \
> \"namespace\" : \"org.apache.avro.null_failure\", \
> \"doc\":\"the purpose of this schema is to replicate the hive avro null failure\", \
> \"fields\" : [{\"name\":\"one\", \"type\":\"null\",\"default\":null}] \
> } " > /tmp/avro/null/schema.avsc
> $ echo "{\"one\":null}" > /tmp/avro/null/data.json
> $ java -jar avro-tools-1.8.1.jar fromjson --schema-file /tmp/avro/null/schema.avsc /tmp/avro/null/data.json > /tmp/avro/null/data.avro
> $ hdfs dfs -mkdir /tmp/avro
> $ hdfs dfs -mkdir /tmp/avro/null
> $ hdfs dfs -mkdir /tmp/avro/null/schema
> $ hdfs dfs -mkdir /tmp/avro/null/data
> $ hdfs dfs -mkdir /tmp/avro/null/data/foo=bar
> $ hdfs dfs -copyFromLocal /tmp/avro/null/schema.avsc /tmp/avro/null/schema/schema.avsc
> $ hdfs dfs -copyFromLocal /tmp/avro/null/data.avro /tmp/avro/null/data/foo=bar/data.avro
> $ hive
> hive> CREATE EXTERNAL TABLE avro_null
> PARTITIONED BY (foo string)
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION
> '/tmp/avro/null/data/'
> TBLPROPERTIES (
> 'avro.schema.url'='/tmp/avro/null/schema/schema.avsc')
> ;
> OK
> Time taken: 3.127 seconds
> hive> msck repair table avro_null;
> OK
> Partitions not in metastore: avro_null:foo=bar
> Repair: Added partition to metastore avro_null:foo=bar
> Time taken: 0.712 seconds, Fetched: 2 row(s)
> hive> select * from avro_null;
> FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException:
> Failed with exception Hive internal error inside
> isAssignableFromSettablePrimitiveOI void not supported
> yet.java.lang.RuntimeException: Hive internal error inside
> isAssignableFromSettablePrimitiveOI void not supported yet.
> hive> select foo, count(1) from avro_null group by foo;
> OK
> bar 1
> Time taken: 29.806 seconds, Fetched: 1 row(s)
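
The schema-creation step in the report relies on escaped quotes inside `echo`, which is easy to break when the text is re-wrapped by a mail client. A minimal sketch of the same step using a quoted heredoc follows. The scratch directory and the variable names `WORK_DIR`, `SCHEMA_FILE` and `DATA_FILE` are assumptions for illustration (the report itself writes to /tmp/avro/null), and the JSON check runs only if `python3` happens to be installed.

```shell
# Hypothetical scratch directory; the report uses /tmp/avro/null instead.
WORK_DIR="$(mktemp -d)"
SCHEMA_FILE="$WORK_DIR/schema.avsc"
DATA_FILE="$WORK_DIR/data.json"

# The quoted delimiter ('EOF') keeps the shell from expanding or
# unescaping anything inside the JSON body.
cat > "$SCHEMA_FILE" <<'EOF'
{
  "type" : "record",
  "name" : "null_failure",
  "namespace" : "org.apache.avro.null_failure",
  "doc" : "the purpose of this schema is to replicate the hive avro null failure",
  "fields" : [{"name":"one", "type":"null", "default":null}]
}
EOF

# Single quotes avoid the backslash escaping used in the report.
echo '{"one":null}' > "$DATA_FILE"

# Optional sanity check, skipped when python3 is not available.
if command -v python3 >/dev/null 2>&1; then
  python3 -m json.tool "$SCHEMA_FILE" >/dev/null && echo "schema is valid JSON"
fi
```

The resulting files can then be passed to the `avro-tools ... fromjson` command from the report in place of /tmp/avro/null/schema.avsc and /tmp/avro/null/data.json.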