dhruve commented on issue #23735: [SPARK-26801][SQL] Read avro types other than 
record
URL: https://github.com/apache/spark/pull/23735#issuecomment-463383894
 
 
   Ideally I would expect input formats to be backward compatible, unless there 
is a good reason not to be.
   
   I understand your views on creating robust unit tests. So let's say something 
changed in the format. In that case, our tests would continue to pass, but all 
Spark jobs reading files generated with the old format would end up failing, 
across organizations. IMHO it is better to address issues like this at the 
framework/library level. This does introduce a shade of integration testing into 
the unit test, but it can help identify an issue earlier, which is what I 
personally prefer.
   
   We digress from the main PR. I don't know why we didn't add support for 
reading non-record types in Avro/JSON. We have a use case where a few upstreams 
generate Avro files with primitive or other non-record data. While other 
frameworks, e.g. Pig, can handle them, users considering a switch to Spark are 
confused that Spark can read only record types.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
