[GitHub] spark pull request #19324: [SPARK-22103] Move HashAggregateExec parent consu...

juliuszsompolski Fri, 22 Sep 2017 09:03:08 -0700

Github user juliuszsompolski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19324#discussion_r140532783
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala ---
    @@ -462,18 +464,36 @@ case class HashAggregateExec(
            $evaluateAggResults
            ${consume(ctx, resultVars)}
            """
    -
         } else if (modes.contains(Partial) || modes.contains(PartialMerge)) {
    -      // This should be the last operator in a stage, we should output UnsafeRow directly
    --- End diff --
    
    Tangent fix: a partial aggregation is not necessarily the last operator
    in its stage. For example, it is followed by more operators in the same
    stage when the shuffle requirement between the partial and final
    aggregation is already satisfied, or between steps 2 and 3 in
    `planAggregateWithOneDistinct`. In those cases, outputting the UnsafeRow
    through UnsafeRowJoiner was unnecessary.
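
A minimal sketch of the first case, assuming a local SparkSession. The object name `PartialAggSameStage`, the helper `build`, and the column `k` are made up for illustration and are not part of this PR. Repartitioning by the grouping key beforehand should already satisfy the distribution the final aggregation requires, so I would expect the printed physical plan to show the partial and final HashAggregate next to each other, with the only Exchange sitting below both of them.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical example object; not taken from the Spark code base.
object PartialAggSameStage {

  // Builds an aggregation whose input is already hash-partitioned by the
  // grouping key, so no extra shuffle should be needed between the partial
  // and the final aggregation.
  def build(spark: SparkSession): DataFrame = {
    import spark.implicits._
    spark.range(0, 1000)
      .withColumn("k", $"id" % 10)
      .repartition($"k") // the only shuffle, on the grouping key
      .groupBy($"k")
      .count()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("partial-agg-same-stage")
      .getOrCreate()
    try {
      // Prints the physical plan; look at where Exchange appears relative
      // to the two HashAggregate nodes.
      build(spark).explain()
    } finally {
      spark.stop()
    }
  }
}
```

Running the same query without the `repartition` call should instead place an Exchange between the two aggregates, which is the situation the removed comment assumed.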


