[ 
https://issues.apache.org/jira/browse/HIVE-16370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rui miranda updated HIVE-16370:
-------------------------------
    Description: 
I was attempting to create Hive tables over some partitioned Avro files. It 
seems the void data type (Avro null) is not supported on partitioned tables (I 
could not replicate the bug on an un-partitioned table).

---------------

I managed to replicate the bug on two different Hive versions:

Hive 1.1.0-cdh5.10.0
Hive 2.1.1-amzn-0
----------------

How to replicate (avro-tools is required to create the Avro files):

$ wget http://mirror.serversupportforum.de/apache/avro/avro-1.8.1/java/avro-tools-1.8.1.jar

$ mkdir /tmp/avro
$ mkdir /tmp/avro/null
$ echo "{ \
  \"type\" : \"record\", \
  \"name\" : \"null_failure\", \
  \"namespace\" : \"org.apache.avro.null_failure\", \
  \"doc\":\"the purpose of this schema is to replicate the hive avro null failure\", \
  \"fields\" : [{\"name\":\"one\", \"type\":\"null\",\"default\":null}] \
} " > /tmp/avro/null/schema.avsc
$ echo "{\"one\":null}" > /tmp/avro/null/data.json
$ java -jar avro-tools-1.8.1.jar fromjson --schema-file /tmp/avro/null/schema.avsc /tmp/avro/null/data.json > /tmp/avro/null/data.avro

$ hdfs dfs -mkdir /tmp/avro
$ hdfs dfs -mkdir /tmp/avro/null
$ hdfs dfs -mkdir /tmp/avro/null/schema
$ hdfs dfs -mkdir /tmp/avro/null/data
$ hdfs dfs -mkdir /tmp/avro/null/data/foo=bar
$ hdfs dfs -copyFromLocal /tmp/avro/null/schema.avsc /tmp/avro/null/schema/schema.avsc
$ hdfs dfs -copyFromLocal /tmp/avro/null/data.avro /tmp/avro/null/data/foo=bar/data.avro
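
The local schema and data files from the steps above can also be written with a quoted heredoc, which avoids the escaped quotes and is less fragile when a mail client wraps long lines. This is only a sketch, assuming a POSIX shell and the same /tmp/avro/null paths as in the report; the WORKDIR variable is introduced here for illustration and is not part of the original steps.

```shell
# WORKDIR is a hypothetical helper variable; the report uses this path literally.
WORKDIR=/tmp/avro/null
mkdir -p "$WORKDIR"

# Quoted heredoc ('EOF'): the shell performs no expansion, so the JSON is
# written exactly as typed and no quote escaping is needed.
cat > "$WORKDIR/schema.avsc" <<'EOF'
{
  "type" : "record",
  "name" : "null_failure",
  "namespace" : "org.apache.avro.null_failure",
  "doc" : "the purpose of this schema is to replicate the hive avro null failure",
  "fields" : [{"name":"one", "type":"null", "default":null}]
}
EOF

# One JSON record matching the schema: the single field "one" is null.
printf '%s\n' '{"one":null}' > "$WORKDIR/data.json"
```

The avro-tools fromjson step and the hdfs dfs commands are unchanged; they read the same two files.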

$ hive 


hive> CREATE EXTERNAL TABLE avro_null
PARTITIONED BY (foo string)
  ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
  STORED as INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION
'/tmp/avro/null/data/'
  TBLPROPERTIES (
    'avro.schema.url'='/tmp/avro/null/schema/schema.avsc')
;



OK
Time taken: 3.127 seconds



hive> msck repair table avro_null;
OK
Partitions not in metastore:    avro_null:foo=bar
Repair: Added partition to metastore avro_null:foo=bar
Time taken: 0.712 seconds, Fetched: 2 row(s)



hive> select * from avro_null;
FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: 
Failed with exception Hive internal error inside 
isAssignableFromSettablePrimitiveOI void not supported 
yet.java.lang.RuntimeException: Hive internal error inside 
isAssignableFromSettablePrimitiveOI void not supported yet.


hive> select foo, count(1)  from avro_null group by foo;

OK
bar     1
Time taken: 29.806 seconds, Fetched: 1 row(s)
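
For comparison, the un-partitioned case mentioned at the top of the description could be set up roughly as follows. This is a sketch under assumptions, not the reporter's exact steps: the table name avro_null_flat and the .hql file path are hypothetical, and LOCATION is assumed to point straight at the foo=bar directory so that the same data file is read without a partition column. No output is shown because it was not part of the report.

```shell
# Hypothetical script path and table name, used for illustration only.
HQL=/tmp/avro_null_flat.hql

# Same SerDe, input/output formats and schema URL as the failing table,
# but with the PARTITIONED BY clause left out.
cat > "$HQL" <<'EOF'
CREATE EXTERNAL TABLE avro_null_flat
  ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
  STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION
'/tmp/avro/null/data/foo=bar/'
  TBLPROPERTIES (
    'avro.schema.url'='/tmp/avro/null/schema/schema.avsc')
;
SELECT * FROM avro_null_flat;
EOF

# Run the script only where the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive -f "$HQL"
fi
```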








        Summary: Avro data type null not supported on partitioned tables  (was: 
avro data type null not supported on partitioned tables)

> Avro data type null not supported on partitioned tables
> -------------------------------------------------------
>
>                 Key: HIVE-16370
>                 URL: https://issues.apache.org/jira/browse/HIVE-16370
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.0, 2.1.1
>            Reporter: rui miranda
>            Priority: Minor
>
> I was attempting to create hive tables over some partitioned Avro files. It 
> seems the void data type (Avro null) is not supported on partitioned tables 
> (i could not replicate the bug on an un-partitioned table).
> ---------------
> i managed to replicate the bug on two different hive versions.
> Hive 1.1.0-cdh5.10.0
> Hive 2.1.1-amzn-0
> ----------------
> how to replicate (avro tools are required to create the avro files):
> $ wget 
> http://mirror.serversupportforum.de/apache/avro/avro-1.8.1/java/avro-tools-1.8.1.jar
> $ mkdir /tmp/avro
> $ mkdir /tmp/avro/null
> $ echo "{ \
>   \"type\" : \"record\", \
>   \"name\" : \"null_failure\", \
>   \"namespace\" : \"org.apache.avro.null_failure\", \
>   \"doc\":\"the purpose of this schema is to replicate the hive avro null 
> failure\", \
>   \"fields\" : [{\"name\":\"one\", \"type\":\"null\",\"default\":null}] \
> } " > /tmp/avro/null/schema.avsc
> $ echo "{\"one\":null}" > /tmp/avro/null/data.json
> $ java -jar avro-tools-1.8.1.jar fromjson --schema-file 
> /tmp/avro/null/schema.avsc /tmp/avro/null/data.json > /tmp/avro/null/data.avro
> $ hdfs dfs -mkdir /tmp/avro
> $ hdfs dfs -mkdir /tmp/avro/null
> $ hdfs dfs -mkdir /tmp/avro/null/schema
> $ hdfs dfs -mkdir /tmp/avro/null/data
> $ hdfs dfs -mkdir /tmp/avro/null/data/foo=bar
> $ hdfs dfs -copyFromLocal /tmp/avro/null/schema.avsc 
> /tmp/avro/null/schema/schema.avsc
> $ hdfs dfs -copyFromLocal /tmp/avro/null/data.avro 
> /tmp/avro/null/data/foo=bar/data.avro
> $ hive 
> hive> CREATE EXTERNAL TABLE avro_null
> PARTITIONED BY (foo string)
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION
> '/tmp/avro/null/data/'
>   TBLPROPERTIES (
>     'avro.schema.url'='/tmp/avro/null/schema/schema.avsc')
> ;
> OK
> Time taken: 3.127 seconds
> hive> msck repair table avro_null;
> OK
> Partitions not in metastore:  avro_null:foo=bar
> Repair: Added partition to metastore avro_null:foo=bar
> Time taken: 0.712 seconds, Fetched: 2 row(s)
> hive> select * from avro_null;
> FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: 
> Failed with exception Hive internal error inside 
> isAssignableFromSettablePrimitiveOI void not supported 
> yet.java.lang.RuntimeException: Hive internal error inside 
> isAssignableFromSettablePrimitiveOI void not supported yet.
> hive> select foo, count(1)  from avro_null group by foo;
> OK
> bar   1
> Time taken: 29.806 seconds, Fetched: 1 row(s)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
