erikerlandson commented on a change in pull request #25024: [SPARK-27296][SQL] User Defined Aggregators that do not ser/de on each input row
URL: https://github.com/apache/spark/pull/25024#discussion_r329814561
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/udaf.scala
 ##########
 @@ -450,3 +452,165 @@ case class ScalaUDAF(
 
   override def nodeName: String = udaf.getClass.getSimpleName
 }
+
+/**
+ * The internal wrapper used to hook a [[UserDefinedImperativeAggregator]] `udia` in the
+ * internal aggregation code path.
+ */
+case class ScalaUDIA[T](
+    children: Seq[Expression],
+    udia: UserDefinedImperativeAggregator[T],
+    mutableAggBufferOffset: Int = 0,
+    inputAggBufferOffset: Int = 0)
+  extends TypedImperativeAggregate[T]
+  with NonSQLExpression
+  with UserDefinedExpression
+  with ImplicitCastInputTypes
+  with Logging {
+
+  def dataType: DataType = udia.resultType
+
+  val inputTypes: Seq[DataType] = udia.inputSchema.map(_.dataType)
+
+  def nullable: Boolean = true
 
 Review comment:
  I took that default from the `ScalaUDAF` class above, which I tried to stay consistent with wherever it made sense. Possibly it could be exposed on the UDIA API as well. For example, my custom aggregators would return an empty (i.e. identity) result rather than null, even on null inputs.
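
For illustration, the identity-vs-null point above can be sketched without Spark: an aggregator whose empty buffer is an identity element can always evaluate to a non-null result, even when every input is null. The trait below only mirrors the general shape of the proposed `UserDefinedImperativeAggregator`; the method names are assumptions made for this sketch, not the PR's actual signatures.

```scala
// Minimal stand-in for the shape of an imperative typed aggregator.
// Method names here are illustrative assumptions, not the PR's final API.
trait ImperativeAgg[T, In, Out] {
  def initial: T                 // the "empty" (identity) aggregation state
  def update(agg: T, in: In): T  // fold one input value into the state
  def merge(a: T, b: T): T       // combine two partial states
  def evaluate(agg: T): Out      // produce the final result
}

// A sum over nullable integers: null inputs are skipped, and the empty
// state evaluates to 0 rather than null.
object NullSkippingSum extends ImperativeAgg[Long, Option[Int], Long] {
  def initial: Long = 0L
  def update(agg: Long, in: Option[Int]): Long = in.fold(agg)(agg + _)
  def merge(a: Long, b: Long): Long = a + b
  def evaluate(agg: Long): Long = agg
}

object Demo {
  def main(args: Array[String]): Unit = {
    val allNull = Seq[Option[Int]](None, None)
    val state = allNull.foldLeft(NullSkippingSum.initial)(NullSkippingSum.update)
    // Even with only null inputs, the result is the identity (0), not null.
    println(NullSkippingSum.evaluate(state))
  }
}
```

Under this design a `nullable = false` declaration would be honest for such an aggregator, which is why exposing the flag on the UDIA API (rather than hard-coding `true`) could matter for downstream optimizations.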

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
