[ 
https://issues.apache.org/jira/browse/AVRO-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2045:
--------------------------------------
    Component/s: java

> Avro should warn about corrupt EOF files
> ----------------------------------------
>
>                 Key: AVRO-2045
>                 URL: https://issues.apache.org/jira/browse/AVRO-2045
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.6
>            Reporter: Lars Volker
>            Assignee: Nandor Kollar
>            Priority: Major
>
> When running queries on truncated files, Impala's Avro scanner issues a 
> warning:
> {noformat}
> WARNINGS: Problem parsing file 
> hdfs://host.company.com:8020/tmp/datagen/some_db/some_table/col1=A/col2=B/col3=D/col4=C/2017-05-18-18-5-9-876-0.avro
>  at 1327214080(EOF)
> Tried to read 64653 bytes but could only read 16549 bytes. This may indicate 
> data file corruption. (file 
> hdfs://host.company.com:8020/tmp/datagen/some_db/some_table/col1=A/col2=B/col3=D/col4=C/2017-05-18-18-5-9-876-0.avro,
>  byte offset: 1327214080)
> {noformat}
> {{avro-tools tojson}} prints the same number of rows that Impala reads, but 
> it does not print a warning. It seems to quietly swallow the EOFException.
> I think it should print a warning in that case.
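
The requested behaviour could be sketched roughly as follows. This is a minimal illustration using only java.io, not Avro's actual reader code: the class name TruncationCheck and the methods readBlock and checkBlock are hypothetical, and the message wording simply mirrors the Impala warning quoted above. The idea is to compare the number of bytes a block claims to contain with the number actually read, and to report a short read instead of treating end-of-file as a normal end of data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Avro: detects a block that ends before
// its declared length and produces a warning instead of hiding the EOF.
public class TruncationCheck {

    /** Reads until 'expected' bytes are read or EOF; returns the bytes actually read. */
    static int readBlock(InputStream in, byte[] buf, int expected) throws IOException {
        int total = 0;
        while (total < expected) {
            int n = in.read(buf, total, expected - total);
            if (n < 0) {
                break; // EOF reached before the block was complete
            }
            total += n;
        }
        return total;
    }

    /** Returns a warning message if the block is truncated, or null if it is complete. */
    static String checkBlock(InputStream in, int expected, long offset) throws IOException {
        byte[] buf = new byte[expected];
        int got = readBlock(in, buf, expected);
        if (got < expected) {
            return "Tried to read " + expected + " bytes but could only read " + got
                + " bytes (byte offset: " + offset + "). This may indicate data file corruption.";
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Simulate a block that declares 64653 bytes while only 16549 remain.
        InputStream truncated = new ByteArrayInputStream(new byte[16549]);
        String warning = checkBlock(truncated, 64653, 1327214080L);
        if (warning != null) {
            System.err.println("WARNING: " + warning);
        }
    }
}
```

Returning the message rather than throwing lets a caller such as a command-line tool still print the rows it managed to read and then emit the warning, which matches what the report asks for.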



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
