[ https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941548#comment-14941548 ]
Ashutosh Chauhan commented on HIVE-11977:
-----------------------------------------
Thanks for the patch, [~dossett].
A 0-length file is an invalid Avro file: Avro's {{DataFileWriter}} always
writes the MAGIC header, which carries the format version, at the start of
the file. That's the reason {{DataFileReader}} expects it and throws when it
doesn't find one.
It seems these 0-length files got there because of some faulty generator
process. Isn't it better to just not generate those 0-length files? Or,
alternatively, to delete these faulty files?
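To illustrate the point about the MAGIC header, here is a minimal sketch of a standalone check that could be used to find such faulty files before Hive reads them. It is not part of the attached patch and does not use the Avro library; the class name {{AvroMagicCheck}} and the method {{hasAvroMagic}} are hypothetical. It only assumes that an Avro container file starts with the four bytes 'O', 'b', 'j', 1.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of Hive or Avro.
final class AvroMagicCheck {
    // Avro object container files begin with 'O', 'b', 'j' and a version byte of 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // Returns true only if the file starts with the Avro magic bytes.
    // A zero-length (or otherwise too short) file returns false.
    static boolean hasAvroMagic(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] header = new byte[MAGIC.length];
            int read = 0;
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    return false; // end of file before a full header was read
                }
                read += n;
            }
            return Arrays.equals(header, MAGIC);
        }
    }

    public static void main(String[] args) throws IOException {
        Path empty = Files.createTempFile("empty", ".avro");
        try {
            System.out.println(hasAvroMagic(empty)); // prints false
        } finally {
            Files.deleteIfExists(empty);
        }
    }
}
```

A cleanup job could run a check like this over the table directory and delete or quarantine any file for which it returns false.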
> Hive should handle an external avro table with zero length files present
> ------------------------------------------------------------------------
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
> Issue Type: Bug
> Reporter: Aaron Dossett
> Assignee: Aaron Dossett
> Attachments: HIVE-11977-2.patch, HIVE-11977.patch
>
>
> If a zero length file is in the top level directory housing an external avro
> table, all hive queries on the table fail.
> The issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader
> creates a new org.apache.avro.file.DataFileReader and DataFileReader throws
> an exception when trying to read an empty file (because the empty file lacks
> the magic number marking it as avro).
> AvroGenericRecordReader should detect an empty file and then behave
> reasonably.
> Caused by: java.io.IOException: Not a data file.
> at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
> at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
> at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
> at org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
> ... 25 more
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)