GitHub user aokolnychyi opened a pull request:

    https://github.com/apache/spark/pull/19193

    [WIP][SPARK-21896][SQL] Fix Stack Overflow when window function is nested 
inside an aggregate function

    ## What changes were proposed in this pull request?
    
    This WIP PR contains a prototype that fixes a StackOverflowError in 
``Analyzer``. In short, Spark cannot handle window expressions nested inside 
aggregate functions. The root cause is that ``ExtractWindowExpressions`` 
cannot extract window expressions from aggregate functions.
    
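    ```
    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.{max, rank}
    import spark.implicits._ // assumes a SparkSession named `spark`, as in spark-shell
    
    val df = Seq((1, 2), (1, 3), (2, 4)).toDF("a", "b")
    val window = Window.orderBy("a")
    
    df.groupBy().agg(max(rank().over(window))) // does not work: StackOverflowError
    df.select(rank().over(window).alias("rank")).agg(max("rank")) // works
    ```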
    It would be nice to get some initial feedback, since there are alternative 
ways to solve this problem.
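    
    To make the idea of "extracting" a window expression from an aggregate 
function concrete, here is a minimal, self-contained sketch on a toy 
expression tree. The types ``Expr``, ``Col``, ``WindowFn``, ``Agg`` and the 
object ``ExtractNestedWindows`` are hypothetical names introduced only for 
illustration. They are not Catalyst classes and this is not necessarily how 
the prototype in this PR works. The sketch only mirrors the manual rewrite 
shown above: compute the window expression under an alias first, then 
aggregate over that alias.
    
    ```scala
    // Toy expression tree. Hypothetical types, NOT Spark's Catalyst classes.
    sealed trait Expr
    case class Col(name: String) extends Expr
    case class WindowFn(fn: String, orderBy: String) extends Expr // e.g. rank() over (order by a)
    case class Agg(fn: String, child: Expr) extends Expr          // e.g. max(child)
    
    object ExtractNestedWindows {
      // Replace a window expression found under `e` with a reference to an
      // alias, and return the (alias, window expression) pairs pulled out.
      // Each node in this toy tree has at most one child, so a single fixed
      // alias is enough here (an assumption of the sketch).
      def extract(e: Expr): (Expr, List[(String, WindowFn)]) = e match {
        case w: WindowFn =>
          val alias = "_w0"
          (Col(alias), List(alias -> w))
        case Agg(fn, child) =>
          val (newChild, found) = extract(child)
          (Agg(fn, newChild), found)
        case c: Col =>
          (c, Nil)
      }
    }
    ```
    
    In this toy model, 
``ExtractNestedWindows.extract(Agg("max", WindowFn("rank", "a")))`` returns 
``Agg("max", Col("_w0"))`` together with the extracted pair 
``"_w0" -> WindowFn("rank", "a")``, which corresponds to the 
``select(...).agg(...)`` form that already works.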
    
    ## How was this patch tested?
    
    This PR is only a prototype and has been tested manually in several 
scenarios.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aokolnychyi/spark spark-21896

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19193.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19193
    
----
commit c14aa2ff6161de7d45869d91e53b0b25b18ad2dd
Author: aokolnychyi <anton.okolnyc...@sap.com>
Date:   2017-09-10T19:04:38Z

    [SPARK-21896][SQL] Fix Stack Overflow when window function is nested inside 
an aggregate function

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
