[GitHub] spark issue #22964: [SPARK-25963] Optimize generate followed by window

2018-11-07 Thread uzadude
Github user uzadude commented on the issue: https://github.com/apache/spark/pull/22964 This is the original query. We can see the explode followed by the shuffle: ``` import org.apache.spark.sql.functions._ import org.apache.spark.sql.expressions._ val N = 1

[GitHub] spark issue #22964: [SPARK-25963] Optimize generate followed by window

2018-11-07 Thread uzadude
Github user uzadude commented on the issue: https://github.com/apache/spark/pull/22964 The whole idea is that we'll get a single shuffle, and it will happen before the explode, since the window's partitioning is contained in the repartition. I'll show the physical plan.

[GitHub] spark issue #22964: [SPARK-25963] Optimize generate followed by window

2018-11-07 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/22964 @uzadude where is this relevant? You will end up with two shuffles if you do this.

[GitHub] spark issue #22964: [SPARK-25963] Optimize generate followed by window

2018-11-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22964 Can one of the admins verify this patch?
