Tim Cerexhe created AVRO-2342:
---------------------------------
Summary: Honor ParseMode in AvroFileFormat
Key: AVRO-2342
URL: https://issues.apache.org/jira/browse/AVRO-2342
Project: Apache Avro
Issue Type: Improvement
Reporter: Tim Cerexhe
Unlike the JSON reader, the Avro reader cannot tolerate malformed or truncated
files. Currently it throws an exception when it encounters any bad or truncated
record in an Avro file, so a single dodgy file causes the entire Spark job to
fail.
Ideally the AvroFileFormat would accept a Permissive or DropMalformed ParseMode,
like Spark's JSON format. This would enable the Avro reader to drop bad
records and continue processing the good records rather than abort the entire
job.
Obviously the default could remain as FailFastMode, which is the current
effective behavior, so this wouldn’t break any existing users.
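For reference, Spark's JSON source selects this behavior through its "mode" read
option (PERMISSIVE, DROPMALFORMED or FAILFAST). The sketch below only illustrates
the three semantics being requested; it is not Spark or Avro code. The names
ParseMode and read_records are hypothetical, JSON strings stand in for Avro
records, and PERMISSIVE is simplified to yielding None for a bad record (Spark
itself produces a row with null fields).

```python
import json
from enum import Enum


class ParseMode(Enum):
    """Hypothetical stand-in for Spark's three parse modes."""
    PERMISSIVE = "PERMISSIVE"
    DROPMALFORMED = "DROPMALFORMED"
    FAILFAST = "FAILFAST"


def read_records(raw_records, mode=ParseMode.FAILFAST):
    """Decode each raw record, handling decode failures according to `mode`.

    JSON decoding is an assumption made for illustration; an Avro reader
    would decode binary blocks instead.
    """
    rows = []
    for raw in raw_records:
        try:
            rows.append(json.loads(raw))
        except ValueError as err:  # json.JSONDecodeError subclasses ValueError
            if mode is ParseMode.FAILFAST:
                # Current effective behavior: one bad record aborts the read.
                raise RuntimeError(f"malformed record: {raw!r}") from err
            if mode is ParseMode.PERMISSIVE:
                # Keep a placeholder so the bad record stays visible.
                rows.append(None)
            # DROPMALFORMED: skip the bad record and carry on.
    return rows
```

With the input ['{"id": 1}', '{"id": 2', '{"id": 3}'], DROPMALFORMED returns the
two good records, PERMISSIVE returns them with None in between, and the default
FAILFAST raises on the truncated second record.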
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)