Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21352#discussion_r202725239 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -3226,7 +3218,7 @@ case class ArrayDistinct(child: Expression) override def dataType: DataType = child.dataType - @transient lazy val elementType: DataType = dataType.asInstanceOf[ArrayType].elementType + private def elementType: DataType = dataType.asInstanceOf[ArrayType].elementType --- End diff -- nit: as this is used in the eval method, with this PR we are re-evaluating this code for each row. Despite probably it is not a big issue, I'd rather not introduce perf regression. WDYT @cloud-fan ?
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org