[GitHub] [spark] sandeep-katta0102 commented on pull request #37191: [SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas

GitBox Tue, 19 Jul 2022 22:02:25 -0700


sandeep-katta0102 commented on PR #37191:
URL: https://github.com/apache/spark/pull/37191#issuecomment-1189831824


   > > And as per [AVRO-2035](https://issues.apache.org/jira/browse/AVRO-2035) 
it is a correctness bug, so if we are fixing this issue, wouldn't it be better to 
make it configurable and let the user decide what to do with an incorrect schema? 
Maybe we can add an Avro option such as `validateDefaults`: 
`spark.read.option("validateDefaults", false).format("avro")`?
   > 
   > Hi, @sandeep-katta0102 . Could you elaborate on what makes you think that is 
a correctness bug? It looks like it causes a runtime exception rather than data 
corruption.
   
   The JIRA description of 
[AVRO-2035](https://issues.apache.org/jira/browse/AVRO-2035) says: "if this 
default value is ever accessed (when reading a gen1-serialized object as a 
gen2) we get this: `org.apache.avro.AvroTypeException: Non-boolean default for 
boolean: "true"`". I am not sure whether this use case is valid for Spark. 
If it is not, this comment can be ignored.
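
   To make the kind of schema in question concrete, here is a minimal sketch in plain Python (standard library only). It is not Avro's or Spark's implementation. The helper `invalid_defaults`, the field names, and the small table of type checks are all hypothetical and cover only a few primitive types. It flags a boolean field whose default is the string `"true"` rather than the JSON boolean `true`, which is the mismatch behind the quoted exception.

```python
import json

# Simplified stand-in for a default-value check (assumption: NOT Avro's code).
# Only a handful of primitive types are covered; unions, records, etc. are skipped.
_CHECKS = {
    "null": lambda v: v is None,
    "boolean": lambda v: isinstance(v, bool),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "long": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}


def invalid_defaults(schema_json):
    """Return names of record fields whose default does not match the declared primitive type."""
    schema = json.loads(schema_json)
    bad = []
    for field in schema.get("fields", []):
        ftype = field["type"]
        # Complex types (dicts/lists) are not checked in this sketch.
        check = _CHECKS.get(ftype) if isinstance(ftype, str) else None
        if "default" in field and check is not None and not check(field["default"]):
            bad.append(field["name"])
    return bad


# Hypothetical schema: `bad_flag` declares a boolean but gives a string default.
SCHEMA = """
{"type": "record", "name": "Example", "fields": [
  {"name": "ok_flag", "type": "boolean", "default": true},
  {"name": "bad_flag", "type": "boolean", "default": "true"}
]}
"""

if __name__ == "__main__":
    print(invalid_defaults(SCHEMA))
```

   With default validation disabled, a schema like this parses without complaint, and the mismatch would only surface if the default of `bad_flag` is actually read.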


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

