HyukjinKwon commented on issue #22878: [SPARK-25789][SQL] Support for Dataset of Avro
URL: https://github.com/apache/spark/pull/22878#issuecomment-473126770

Yes, I was referring to both untyped APIs. So the point of adding this API is 1. it is typed (and therefore might be able to use some Avro APIs as well) and 2. presumably better performance. Got it.

I was wondering how much it's worth, considering the code being added here and the maintenance overhead (it adds around 1000 lines). Which APIs are missing compared to the set of Avro APIs that could be used within Apache Spark?

Some typed APIs, like https://github.com/apache/spark/pull/23763, were deprecated because keeping them virtually means we should add typed versions of all the untyped ones. For instance, I currently don't think we should add encoders compatible with other formats afterward, although we can support a plugin approach.
