blcksrx commented on a change in pull request #31771:
URL: https://github.com/apache/spark/pull/31771#discussion_r589258693
##########
File path:
external/avro/src/main/scala/org/apache/spark/sql/avro/functions.scala
##########
@@ -65,6 +68,31 @@ object functions {
new Column(AvroDataToCatalyst(data.expr, jsonFormatSchema,
options.asScala.toMap))
}
+  /**
+   * Converts a binary column of Avro format into its corresponding catalyst value.
+   * The specified subject must match the actual schema of the read data, otherwise
+   * the behavior is undefined: it may fail or return an arbitrary result.
+   * To deserialize the data with a compatible and evolved schema, the expected Avro
+   * schema can be set via the option avroSchema.
+   *
+   * @param data the binary column.
+   * @param subject the subject name in the schema registry, e.g. for topic t:
+   *                key: t-key, value: t-value
+   * @param schemaRegistryUri URL of the schema registry
+   *
+   * @since 3.0.0
+   */
+  @throws(classOf[java.io.IOException])
+  @Experimental
+  def from_avro(
+      data: Column,
+      subject: String,
+      schemaRegistryUri: String): Column = {
Review comment:
What are your reasons for avoiding it? In my opinion, they are different
things, and maintaining this new API isn't really complicated. Also, as I
mentioned, Databricks has already implemented this API, so don't you agree
it's better to keep them the same? It would provide a better experience for
developers.
Anyway, you are the boss here, and I'll follow you. What's your idea?
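For readers following the thread, a sketch of how the overload proposed in this PR would be called. This is not a released Spark API; the Kafka topic, subject name, and registry address below are illustrative placeholders, and running it would require a Spark cluster, a Kafka broker, and a Confluent-style schema registry:

```scala
import org.apache.spark.sql.SparkSession
// Hypothetical import: the overload below exists only in this PR's branch.
import org.apache.spark.sql.avro.functions.from_avro

val spark = SparkSession.builder().appName("avro-registry-sketch").getOrCreate()
import spark.implicits._

// Read raw Avro-encoded bytes from a Kafka topic "t" (placeholder addresses).
val kafkaDf = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "t")
  .load()

// With the proposed overload, the schema is fetched from the registry by
// subject name ("t-value" for the topic's value) instead of being passed
// inline as a JSON schema string.
val parsed = kafkaDf.select(
  from_avro($"value", "t-value", "http://localhost:8081").as("record"))
```

The design question in this thread is whether this subject-based lookup should be a separate overload, as sketched, or folded into the existing `from_avro(data, jsonFormatSchema, options)` signature.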
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]