gengliangwang commented on a change in pull request #25419: [SPARK-28698][SQL] Support user-specified output schema in `to_avro`
URL: https://github.com/apache/spark/pull/25419#discussion_r313206434
##########
File path: external/avro/src/main/scala/org/apache/spark/sql/avro/CatalystDataToAvro.scala
##########
@@ -19,19 +19,24 @@ package org.apache.spark.sql.avro

 import java.io.ByteArrayOutputStream

+import org.apache.avro.Schema
 import org.apache.avro.generic.GenericDatumWriter
 import org.apache.avro.io.{BinaryEncoder, EncoderFactory}

 import org.apache.spark.sql.catalyst.expressions.{Expression, UnaryExpression}
 import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, ExprCode}
 import org.apache.spark.sql.types.{BinaryType, DataType}

-case class CatalystDataToAvro(child: Expression) extends UnaryExpression {
+case class CatalystDataToAvro(
+    child: Expression,
+    jsonFormatSchema: Option[String]) extends UnaryExpression {

Review comment:
Here I am trying to avoid a parameter with a default value. The result is quite different with and without a specified schema. Also, this is consistent with `CatalystDataToAvro`.
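
To illustrate the point about avoiding a default value, here is a minimal, dependency-free sketch. It is not the Spark implementation: `ToAvroSketch`, `OutputSchema`, `DerivedFromCatalystType`, and `UserSpecified` are hypothetical stand-ins, and `child` is simplified to a `String`. The idea is that with `jsonFormatSchema: Option[String]` and no default, every call site has to say explicitly whether a schema is supplied, so the two quite different behaviours cannot be mixed up by omission.

```scala
// Hypothetical stand-ins for illustration only; these are not Spark classes.
sealed trait OutputSchema
// No schema given: the output schema would be derived from the input's data type.
case object DerivedFromCatalystType extends OutputSchema
// Schema given: the user-provided JSON schema string is used instead.
final case class UserSpecified(json: String) extends OutputSchema

// Mirrors the shape of the constructor in the diff: the schema is an Option
// with no default value, so callers must pass None or Some(...) explicitly.
final case class ToAvroSketch(child: String, jsonFormatSchema: Option[String]) {
  def outputSchema: OutputSchema = jsonFormatSchema match {
    case Some(json) => UserSpecified(json)
    case None       => DerivedFromCatalystType
  }
}

object ToAvroSketchDemo {
  def main(args: Array[String]): Unit = {
    val derived  = ToAvroSketch("col", None)
    val explicit = ToAvroSketch("col", Some("""{"type":"string"}"""))
    println(derived.outputSchema)
    println(explicit.outputSchema)
  }
}
```

With a default such as `jsonFormatSchema: Option[String] = None`, a call like `ToAvroSketch("col")` would compile and silently take the derived-schema path, which is the ambiguity the explicit parameter avoids.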