Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19506#discussion_r145741653
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala ---
@@ -153,13 +129,14 @@ case class ApproxCountDistinctForIntervals(
// endpoints are sorted into ascending order already
if (endpoints.head > doubleValue || endpoints.last < doubleValue) {
// ignore if the value is out of the whole range
- return
+ return buffer
}
val hllppIndex = findHllppIndex(doubleValue)
- val offset = mutableAggBufferOffset + hllppIndex * numWordsPerHllpp
- hllppArray(hllppIndex).update(buffer, offset, value, child.dataType)
+ val offset = hllppIndex * numWordsPerHllpp
+ hllppArray(hllppIndex).update(LongArrayInput(buffer), offset, value, child.dataType)
--- End diff ---
You can just pass `InternalRow(buffer)` here to save a lot of code changes. If performance matters, you can create a `LongArrayInternalRow` to avoid boxing.
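For reference, a minimal sketch of what such a `LongArrayInternalRow` could look like (the name and placement are illustrative, and this assumes the HLL++ helper only calls `getLong`/`setLong` on the buffer row):

```scala
import org.apache.spark.sql.catalyst.expressions.GenericInternalRow

// Wraps the aggregation buffer's Array[Long] directly, so getLong/setLong
// read and write the primitive array without boxing each word into an Any.
case class LongArrayInternalRow(array: Array[Long]) extends GenericInternalRow {
  override def getLong(offset: Int): Long = array(offset)
  override def setLong(offset: Int, value: Long): Unit = { array(offset) = value }
}
```

The update call above would then take `LongArrayInternalRow(buffer)` instead of `LongArrayInput(buffer)`, whereas building a `GenericInternalRow` from the buffer values would box every long on each update.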
---