[jira] [Commented] (HIVE-11827) STORED AS AVRO fails SELECT COUNT(*) when empty

Xuefu Zhang (JIRA) Fri, 18 Sep 2015 09:13:39 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-11827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14875856#comment-14875856
 ]


Xuefu Zhang commented on HIVE-11827:
------------------------------------

My understanding is that for AVRO, schema must be provided via either a literal 
or url. In other words, the schema for AVRO tables comes from external. If we 
are adding functionality, then we have to ensure that the new feature works in 
more general cases than just an empty table.

> STORED AS AVRO fails SELECT COUNT(*) when empty
> -----------------------------------------------
>
>                 Key: HIVE-11827
>                 URL: https://issues.apache.org/jira/browse/HIVE-11827
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>         Environment: CDH5.4.5
>            Reporter: Johndee Burks
>            Assignee: Yongzhi Chen
>            Priority: Minor
>         Attachments: HIVE-11827.1.patch
>
>
> If you create a table stored as avro and try to do select count(*) against 
> the table it will fail. The following shows this. Empty table in this 
> situation is a table with no files. 
> {code}
> hive> create table j2 (a int) stored as avro;
> OK
> Time taken: 1.069 seconds
> hive> select count(*) from j2;
> Query ID = johndee_20150915113434_d4fe99d4-7fb9-42fe-9b91-ad560eeacc48
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.io.IOException: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: 
> Neither avro.schema.literal nor avro.schema.url specified, can't determine 
> table schema
>       at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat.getHiveRecordWriter(AvroContainerOutputFormat.java:65)
>       at 
> org.apache.hadoop.hive.ql.exec.Utilities.createEmptyFile(Utilities.java:3430)
>       at 
> org.apache.hadoop.hive.ql.exec.Utilities.createDummyFileForEmptyPartition(Utilities.java:3463)
>       at 
> org.apache.hadoop.hive.ql.exec.Utilities.getInputPaths(Utilities.java:3387)
>       at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:370)
>       at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
>       at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
>       at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
>       at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1640)
>       at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1399)
>       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
>       at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
>       at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
>       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
>       at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>       at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
>       at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither 
> avro.schema.literal nor avro.schema.url specified, can't determine table 
> schema
>       at 
> org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:109)
>       at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat.getHiveRecordWriter(AvroContainerOutputFormat.java:63)
>       ... 24 more
> Job Submission failed with exception 
> 'java.io.IOException(org.apache.hadoop.hive.serde2.avro.AvroSerdeException: 
> Neither avro.schema.literal nor avro.schema.url specified, can't determine 
> table schema)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-11827) STORED AS AVRO fails SELECT COUNT(*) when empty

Reply via email to