[GitHub] [spark] LeonardoZV commented on pull request #31771: [SPARK-34652][AVRO] Support SchemaRegistry in from_avro method

GitBox Thu, 01 Apr 2021 10:37:24 -0700


LeonardoZV commented on pull request #31771:
URL: https://github.com/apache/spark/pull/31771#issuecomment-812063145



   My humble opinion: I think Spark should somehow support Confluent SR.
   
   The use of event-driven architecture is skyrocketing, and big companies like 
to control event metadata so they don't lose control of what is being shared on 
the message bus (data governance). Confluent SR is the main player here. I 
think that not supporting it will make big companies go elsewhere. Projects like 
Apache Camel and AWS products (e.g. Glue) already support it because of its 
importance. I work for the biggest private Brazilian bank and I don't see us 
using Spark without it. 
   
   I just started learning Spark, but I know Kafka very well, and I think there's 
one more thing to discuss here: Confluent SR supports multiple schemas per 
topic (TopicRecordNameStrategy), so if Spark is to fully support Schema 
Registry, it needs a way to deal with that in addition to schema evolution. 
To do that today, we would need a separate consumer for each schema, each 
with its own filter, right (I'm still learning Spark)? That sounds wasteful.
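
   A rough sketch of what telling the schemas apart could look like, assuming 
the usual Confluent wire format (one magic byte 0, then a 4-byte big-endian 
schema ID, then the Avro payload). The helper names are hypothetical, and this 
is plain Python rather than any existing Spark API:

```python
import struct
from collections import defaultdict

MAGIC_BYTE = 0  # first byte of a Confluent-framed message (assumed wire format)


def split_confluent_header(message: bytes):
    """Return (schema_id, avro_payload) for one Confluent-framed message."""
    if len(message) < 5 or message[0] != MAGIC_BYTE:
        raise ValueError("not a Confluent Schema Registry framed message")
    # Bytes 1..4 hold the schema ID as an unsigned big-endian 32-bit integer.
    (schema_id,) = struct.unpack(">I", message[1:5])
    return schema_id, message[5:]


def group_by_schema_id(messages):
    """Bucket the raw Avro payloads of one topic by the schema ID in each header."""
    groups = defaultdict(list)
    for message in messages:
        schema_id, payload = split_confluent_header(message)
        groups[schema_id].append(payload)
    return dict(groups)
```

With something like this, a single pass over the topic could route each record 
to the decoder for its own schema instead of running one filtered consumer per 
schema.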

