[ https://issues.apache.org/jira/browse/AVRO-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041209#comment-14041209 ]

Brock Noland commented on AVRO-1530:
------------------------------------

I'd propose that when reading the header, an EOFException and a plain IOException should be treated
differently. For example, the EOFException should be propagated as-is so that
upstream users, e.g. Hive, can distinguish between the two errors.
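
A minimal sketch of that distinction, assuming the header check in DataFileStream.initialize() currently wraps any failure to read the magic bytes into a generic IOException("Not a data file."). The class and method names below are simplified for illustration and use a plain InputStream instead of Avro's BinaryDecoder:

{code:java}
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical, simplified version of the header check in DataFileStream.
class HeaderCheckSketch {
  // Avro file magic: 'O', 'b', 'j', version 1.
  private static final byte[] MAGIC = new byte[] { 'O', 'b', 'j', 1 };

  static void readHeader(InputStream in) throws IOException {
    byte[] magic = new byte[MAGIC.length];
    int read = 0;
    while (read < magic.length) {
      int n = in.read(magic, read, magic.length - read);
      if (n < 0) {
        // Proposal: surface the EOF as an EOFException rather than wrapping it,
        // so callers can tell an empty/truncated file from a corrupt one.
        throw new EOFException("End of file reached while reading the header");
      }
      read += n;
    }
    if (!Arrays.equals(MAGIC, magic)) {
      throw new IOException("Not a data file.");  // header present but corrupt
    }
  }
}
{code}

Since EOFException extends IOException, existing catch blocks keep working, while an upstream consumer such as Hive could branch on the more specific type (here "in" is the InputStream for the file being opened):

{code:java}
try {
  HeaderCheckSketch.readHeader(in);
} catch (EOFException e) {
  // empty (or truncated) file: skip it, or treat it as zero records
} catch (IOException e) {
  // corrupt file: fail the task or report the bad path
}
{code}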

> Java DataFileStream does not allow distinguishing between empty files and 
> corrupt files
> ---------------------------------------------------------------------------------------
>
>                 Key: AVRO-1530
>                 URL: https://issues.apache.org/jira/browse/AVRO-1530
>             Project: Avro
>          Issue Type: Bug
>            Reporter: Brock Noland
>
> When writing data to HDFS, especially with Flume, it's possible to write 
> empty files. When you run Hive queries over this data, the job fails with 
> "Not a data file." from here 
> https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/file/DataFileStream.java#L102



