Github user setjet commented on a diff in the pull request:

https://github.com/apache/spark/pull/18113#discussion_r153022290

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala ---

```scala
@@ -99,3 +94,91 @@ class TypedAverage[IN](val f: IN => Double) extends Aggregator[IN, (Double, Long
     toColumn.asInstanceOf[TypedColumn[IN, java.lang.Double]]
   }
 }
+
+class TypedMinDouble[IN](val f: IN => Double) extends Aggregator[IN, Double, Double] {
+  override def zero: Double = Double.PositiveInfinity
+  override def reduce(b: Double, a: IN): Double = math.min(b, f(a))
+  override def merge(b1: Double, b2: Double): Double = math.min(b1, b2)
+  override def finish(reduction: Double): Double = {
+    if (Double.PositiveInfinity == reduction) {
```

--- End diff --

Doesn't that boil down to what was there previously? https://github.com/apache/spark/pull/18113/commits/51783b55197cea6c130722838ec97ad6df5c92be

```scala
override def zero: java.lang.Double = null
override def reduce(b: java.lang.Double, a: IN): java.lang.Double =
  if (b == null) f(a) else math.max(b, f(a))
override def merge(b1: java.lang.Double, b2: java.lang.Double): java.lang.Double = {
  if (b1 == null) {
    b2
  } else if (b2 == null) {
    b1
  } else {
    math.max(b1, b2)
  }
}
override def finish(reduction: java.lang.Double): java.lang.Double = reduction
```

Here we just return null if the input set is empty, or in the edge case you just mentioned. You rejected that version on June 8th out of concern about boxing performance.
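The trade-off being debated (sentinel value vs. boxed null as the "empty" marker for a typed min) can be sketched outside Spark's `Aggregator` API. The following is a minimal, self-contained comparison; the object name `MinAggregationSketch` and both helper methods are illustrative, not taken from the PR.

```scala
// Standalone sketch of the two buffer representations discussed above.
// Not tied to Spark's Aggregator API; names are hypothetical.
object MinAggregationSketch {

  // Sentinel approach (current PR): zero is +Infinity, buffers stay primitive,
  // and finish must special-case the sentinel. It cannot distinguish an empty
  // input set from a genuine Double.PositiveInfinity input.
  def mergeSentinel(b1: Double, b2: Double): Double = math.min(b1, b2)

  def finishSentinel(reduction: Double): java.lang.Double =
    if (reduction == Double.PositiveInfinity) null else reduction

  // Boxed-null approach (earlier commit): null marks "no value seen yet",
  // which handles the empty set cleanly but boxes every intermediate value.
  def mergeBoxed(b1: java.lang.Double, b2: java.lang.Double): java.lang.Double =
    if (b1 == null) b2
    else if (b2 == null) b1
    else math.min(b1, b2)

  def main(args: Array[String]): Unit = {
    println(mergeBoxed(null, 3.0))                   // null buffer absorbs the other side
    println(mergeBoxed(2.0, 3.0))                    // ordinary min once both are set
    println(finishSentinel(Double.PositiveInfinity)) // empty set maps to null
  }
}
```

The sketch makes the commenter's point concrete: both variants end up returning null for the empty set, so the difference is where the null lives (only in `finish`, versus in every buffer during `merge`/`reduce`), which is exactly the boxing-overhead concern.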