Tim Cerexhe created SPARK-27093:
-----------------------------------

             Summary: Honor ParseMode in AvroFileFormat
                 Key: SPARK-27093
                 URL: https://issues.apache.org/jira/browse/SPARK-27093
             Project: Spark
          Issue Type: Improvement
          Components: Input/Output
    Affects Versions: 2.4.0
            Reporter: Tim Cerexhe


Unlike the JSON reader, the Avro reader has no way to tolerate malformed or 
truncated files. It currently throws an exception as soon as it encounters a 
bad or truncated record in an Avro file, so a single dodgy file causes the 
entire Spark job to fail. 

Ideally AvroFileFormat would accept a Permissive or DropMalformed ParseMode, 
as Spark's JSON format does. This would enable the Avro reader to drop bad 
records and continue processing the good records rather than abort the entire 
job. 

Obviously the default could remain as FailFastMode, which is the current 
effective behavior, so this wouldn’t break any existing users.
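
To make the requested behavior concrete, here is a rough sketch of the three 
mode semantics in plain Python. It is not Spark code and not a proposed 
implementation: the ParseMode enum, read_records, and the decode callback are 
hypothetical names, json.loads merely stands in for an Avro record decoder, 
and the permissive branch is simplified to emit None where Spark's JSON reader 
would emit a row with null fields.

```python
import json
from enum import Enum


class ParseMode(Enum):
    # Names mirror the modes of Spark's JSON reader; the enum itself is illustrative.
    PERMISSIVE = "PERMISSIVE"
    DROPMALFORMED = "DROPMALFORMED"
    FAILFAST = "FAILFAST"


def read_records(raw_records, decode, mode=ParseMode.FAILFAST):
    """Decode each raw record, handling decode failures according to `mode`.

    `decode` is a hypothetical stand-in for a record decoder that raises
    ValueError on malformed or truncated input.
    """
    rows = []
    for raw in raw_records:
        try:
            rows.append(decode(raw))
        except ValueError as err:
            if mode is ParseMode.FAILFAST:
                # Current effective behavior: one bad record aborts the read.
                raise RuntimeError(f"malformed record: {raw!r}") from err
            if mode is ParseMode.PERMISSIVE:
                # Simplification: keep a placeholder for the bad record.
                rows.append(None)
            # DROPMALFORMED: skip the bad record and keep going.
    return rows


# The second record is truncated, so json.loads raises on it.
raw = ['{"id": 1}', '{"id": 2', '{"id": 3}']
permissive = read_records(raw, json.loads, ParseMode.PERMISSIVE)
dropped = read_records(raw, json.loads, ParseMode.DROPMALFORMED)
```

With the default FAILFAST mode the same call raises on the truncated record, 
which matches what the Avro reader effectively does today.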


