[ https://issues.apache.org/jira/browse/AVRO-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thiruvalluvan M. G. updated AVRO-2045:
--------------------------------------
Component/s: java
> Avro should warn about corrupt EOF files
> ----------------------------------------
>
> Key: AVRO-2045
> URL: https://issues.apache.org/jira/browse/AVRO-2045
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.7.6
> Reporter: Lars Volker
> Assignee: Nandor Kollar
> Priority: Major
>
> When running queries on truncated files, Impala's Avro scanner issues a
> warning:
> {noformat}
> WARNINGS: Problem parsing file
> hdfs://host.company.com:8020/tmp/datagen/some_db/some_table/col1=A/col2=B/col3=D/col4=C/2017-05-18-18-5-9-876-0.avro
> at 1327214080(EOF)
> Tried to read 64653 bytes but could only read 16549 bytes. This may indicate
> data file corruption. (file
> hdfs://host.company.com:8020/tmp/datagen/some_db/some_table/col1=A/col2=B/col3=D/col4=C/2017-05-18-18-5-9-876-0.avro,
> byte offset: 1327214080)
> {noformat}
> {{avro-tools tojson}} eventually prints the same number of rows that Impala
> reads, but it does not print a warning. It seems to quietly swallow the
> EOFException.
> I think it should report the truncation with a warning.
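
For illustration only, here is a minimal sketch of the behaviour requested above: a reader that keeps every block it could read in full, but reports a truncated tail rather than silently stopping. It is not Avro's code and does not use the Avro container format. The 4-byte length prefix is a simplified stand-in, and the names TruncationAwareReader, ScanResult and readBlocks are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: blocks are [4-byte big-endian length][payload].
// This is NOT the Avro file layout; it only illustrates warning on a short read.
final class TruncationAwareReader {

    /** Blocks that were read in full, plus a warning if the input was cut short. */
    static final class ScanResult {
        final List<byte[]> blocks = new ArrayList<>();
        String warning; // stays null when the input ends cleanly on a block boundary
    }

    /** Fills buf as far as possible and returns how many bytes were actually read. */
    private static int readUpTo(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    private static String shortRead(int wanted, int got, long offset) {
        return "Tried to read " + wanted + " bytes but could only read " + got
                + " bytes at byte offset " + offset
                + ". This may indicate data file corruption.";
    }

    static ScanResult readBlocks(InputStream in) throws IOException {
        ScanResult result = new ScanResult();
        byte[] header = new byte[4];
        long offset = 0;
        while (true) {
            int got = readUpTo(in, header);
            if (got == 0) {
                return result; // clean EOF: nothing to report
            }
            if (got < header.length) {
                result.warning = shortRead(header.length, got, offset);
                return result;
            }
            offset += header.length;
            int length = ByteBuffer.wrap(header).getInt();
            if (length < 0) {
                result.warning = "Negative block length " + length
                        + " at byte offset " + (offset - header.length);
                return result;
            }
            byte[] payload = new byte[length];
            got = readUpTo(in, payload);
            if (got < length) {
                // Keep the complete blocks, but surface the truncation to the caller.
                result.warning = shortRead(length, got, offset);
                return result;
            }
            result.blocks.add(payload);
            offset += length;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (String s : new String[] {"first", "second"}) {
            byte[] payload = s.getBytes(StandardCharsets.UTF_8);
            out.writeInt(payload.length);
            out.write(payload);
        }
        byte[] whole = bytes.toByteArray();
        // Simulate a truncated file by dropping the last three bytes.
        byte[] truncated = Arrays.copyOf(whole, whole.length - 3);

        ScanResult r = readBlocks(new ByteArrayInputStream(truncated));
        System.out.println(r.blocks.size() + " complete block(s) read");
        if (r.warning != null) {
            System.err.println("WARNING: " + r.warning);
        }
    }
}
```

The point of the sketch is the return value: the caller still gets the rows that were readable, as {{avro-tools tojson}} does today, and additionally gets a message it can print. Whether the real fix should log, throw, or set an exit code is a separate design question that this sketch does not settle.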