Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22305#discussion_r232485612
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala ---
@@ -27,17 +27,62 @@ import org.apache.spark.api.python.{ChainedPythonFunctions, PythonEvalType}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions._
-import org.apache.spark.sql.catalyst.plans.physical._
-import org.apache.spark.sql.execution.{GroupedIterator, SparkPlan, UnaryExecNode}
+import org.apache.spark.sql.catalyst.plans.physical.{AllTuples, ClusteredDistribution, Distribution, Partitioning}
+import org.apache.spark.sql.execution.{ExternalAppendOnlyUnsafeRowArray, SparkPlan}
import org.apache.spark.sql.execution.arrow.ArrowUtils
-import org.apache.spark.sql.types.{DataType, StructField, StructType}
+import org.apache.spark.sql.execution.window._
+import org.apache.spark.sql.types._
import org.apache.spark.util.Utils
+/**
+ * This class calculates and outputs windowed aggregates over the rows in a single partition.
+ *
+ * It is very similar to [[WindowExec]] and has similar logic. The main difference is that this
+ * node doesn't not compute any window aggregation values. Instead, it computes the lower and
+ * upper bound for each window (i.e. window bounds) and pass the data and indices to python work
+ * to do the actual window aggregation.
+ *
+ * It currently materialize all data associated with the same partition key and pass them to
--- End diff --
tiny typo: `materialize` -> `materializes` and `pass` -> `passes`.
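
For context on the design the quoted Scaladoc describes, here is a minimal sketch (not taken from this PR) of how a window-style physical node can require that all rows sharing a partition key be co-located before a whole partition is materialized and handed to the Python worker. It uses the `AllTuples` / `ClusteredDistribution` classes visible in the new imports; `partitionSpec` is an illustrative name, not necessarily what the PR uses.

```scala
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.plans.physical.{AllTuples, ClusteredDistribution, Distribution}

// Hypothetical sketch: declare how child output must be distributed so that each task
// sees complete window partitions before sending them to the Python worker.
trait WindowLikeRequirements {
  // Illustrative name: the PARTITION BY expressions of the window spec.
  def partitionSpec: Seq[Expression]

  def requiredChildDistribution: Seq[Distribution] = {
    if (partitionSpec.isEmpty) {
      // No PARTITION BY clause: every row belongs to one global window partition,
      // so all tuples must end up in a single partition.
      AllTuples :: Nil
    } else {
      // Co-locate rows with equal partition keys so each task holds whole partitions.
      ClusteredDistribution(partitionSpec) :: Nil
    }
  }
}
```

This mirrors the general pattern used by [[WindowExec]]-style nodes; the bounds computation and Arrow transfer described in the quoted comment sit on top of this distribution requirement.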