Github user icexelloss commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22305#discussion_r232084279
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala ---
    @@ -73,68 +118,147 @@ case class WindowInPandasExec(
       }
     
       /**
    -   * Create the resulting projection.
    -   *
    -   * This method uses Code Generation. It can only be used on the executor side.
    +   * Helper function to get all relevant helper functions and data structures for window bounds
        *
    -   * @param expressions unbound ordered function expressions.
    -   * @return the final resulting projection.
    +   * This function returns:
    +   * (1) Total number of window bound indices in the python input row
    +   * (2) Function from frame index to its lower bound column index in the python input row
    +   * (3) Function from frame index to its upper bound column index in the python input row
    +   * (4) Function that returns a frame requires window bound indices in the python input row
    +   *     (unbounded window doesn't need it)
    +   * (5) Function from frame index to its eval type
        */
    -  private[this] def createResultProjection(expressions: Seq[Expression]): UnsafeProjection = {
    -    val references = expressions.zipWithIndex.map { case (e, i) =>
    -      // Results of window expressions will be on the right side of child's output
    -      BoundReference(child.output.size + i, e.dataType, e.nullable)
    +  private def computeWindowBoundHelpers(
    +      factories: Seq[InternalRow => WindowFunctionFrame]
    +  ): (Int, Int => Int, Int => Int, Int => Boolean, Int => Int) = {
    +    val dummyRow = new SpecificInternalRow()
    --- End diff --
    
    Yes, this is for figuring out the type of each `WindowFunctionFrame`.
These function frames are created temporarily and thrown away when this
function returns, so it's not great... However, in order to create the proper
function frames we would need to know the total number of window indices, so
it's a bit of a chicken-and-egg problem here... I don't see an easy way :(
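
To make the two-pass idea concrete, here is a minimal, Spark-free sketch of it. Everything in it is an assumption for illustration: `Row` and `Frame` are hypothetical stand-ins for `InternalRow` and `WindowFunctionFrame`, the layout of two consecutive bound columns per bounded frame is a guess rather than what the PR does, and the eval-type function is left out.

```scala
object WindowBoundSketch {
  // Hypothetical stand-ins for Spark's InternalRow and WindowFunctionFrame.
  final case class Row(values: Seq[Any])
  sealed trait Frame
  case object UnboundedFrame extends Frame
  case object BoundedFrame extends Frame

  /** Returns (number of bound indices, lower index fn, upper index fn, needs-bounds fn). */
  def computeWindowBoundHelpers(
      factories: Seq[Row => Frame]
  ): (Int, Int => Int, Int => Int, Int => Boolean) = {
    val dummyRow = Row(Seq.empty)

    // Pass 1: build throwaway frames only to learn which kind each one is.
    val requiresBounds: Vector[Boolean] = factories.map { factory =>
      factory(dummyRow) match {
        case BoundedFrame   => true
        case UnboundedFrame => false
      }
    }.toVector

    // Assumed layout: each bounded frame owns two consecutive columns
    // (lower, upper); unbounded frames own none.
    val lowerIndex: Vector[Int] =
      requiresBounds.scanLeft(0)((next, needs) => if (needs) next + 2 else next).init

    val numBoundIndices = requiresBounds.count(identity) * 2
    val lower: Int => Int = i => if (requiresBounds(i)) lowerIndex(i) else -1
    val upper: Int => Int = i => if (requiresBounds(i)) lowerIndex(i) + 1 else -1
    val needsBounds: Int => Boolean = i => requiresBounds(i)

    (numBoundIndices, lower, upper, needsBounds)
  }
}
```

The frames built in pass 1 are discarded once their kind is known; the real frames can only be created afterwards, when the total number of bound indices is available, which is the chicken-and-egg part.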

