HyukjinKwon commented on a change in pull request #31771:
URL: https://github.com/apache/spark/pull/31771#discussion_r589278506
##########
File path:
external/avro/src/main/scala/org/apache/spark/sql/avro/functions.scala
##########
@@ -65,6 +68,31 @@ object functions {
new Column(AvroDataToCatalyst(data.expr, jsonFormatSchema, options.asScala.toMap))
}
+ /**
+ * Converts a binary column of Avro format into its corresponding catalyst value.
+ * The specified subject must match actual schema of the read data, otherwise the behavior
+ * is undefined: it may fail or return arbitrary result.
+ * To deserialize the data with a compatible and evolved schema, the expected Avro schema can be
+ * set via the option avroSchema.
+ *
+ * @param data the binary column.
+ * @param subject the subject name in the schema-registry. eg. topic: t, key: t-key value: t-value
+ * @param schemaRegistryUri address of the schema-registry url
+ *
+ * @since 3.0.0
+ */
+ @throws(classOf[java.io.IOException])
+ @Experimental
+ def from_avro(
+ data: Column,
+ subject: String,
+ schemaRegistryUri: String): Column = {
Review comment:
The reason for avoiding it is to minimize the number of APIs exposed to end
users and to keep the API easy to use. Arguably, using a URL is not a very
common use case. Adding an option is fine, but I would like to avoid adding an
API dedicated to it.
This is Apache Spark, not another company's fork. I wasn't involved in
that API design at Databricks. No one is a boss in Apache projects. I am
suggesting and reviewing from my own perspective. It's up to you to reject
this, wait for other committers' opinions, and let them merge it unless
somebody explicitly objects.
What's the reason for avoiding doing this with a single option? Isn't that
simpler and easier to reason about?
cc @gengliangwang too.
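For illustration only, here is a rough sketch of what "an option instead of a dedicated API" could look like. It is plain Scala with no Spark dependency, and nothing in it comes from the PR or from Spark itself: the object `SchemaSourceSketch`, the option keys `schemaRegistryUri` and `schemaRegistrySubject`, and the `resolve` helper are all hypothetical names. The idea is that the existing options map carries the registry settings, so the public signature stays the same.

```scala
// Hypothetical sketch: none of these names are Spark or PR APIs.
object SchemaSourceSketch {
  // Where the reader schema would come from.
  sealed trait SchemaSource
  final case class Inline(jsonFormatSchema: String) extends SchemaSource
  final case class Registry(uri: String, subject: String) extends SchemaSource

  // Assumed option keys, chosen only for this example.
  val RegistryUriKey = "schemaRegistryUri"
  val SubjectKey = "schemaRegistrySubject"

  // Decide the schema source from the options map alone, so no new
  // overload is needed for the schema-registry case.
  def resolve(
      jsonFormatSchema: String,
      options: Map[String, String]): Either[String, SchemaSource] =
    (options.get(RegistryUriKey), options.get(SubjectKey)) match {
      case (Some(uri), Some(subject)) => Right(Registry(uri, subject))
      case (None, None) => Right(Inline(jsonFormatSchema))
      case _ => Left(s"'$RegistryUriKey' and '$SubjectKey' must be set together")
    }
}
```

With no registry options set, `SchemaSourceSketch.resolve("{}", Map.empty)` falls back to the inline schema; with both keys set, it selects the registry; with only one of them, it reports an error instead of guessing.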