[
https://issues.apache.org/jira/browse/SPARK-38034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhou xiang updated SPARK-38034:
-------------------------------
Description:
{code:java}
val df = spark.range(10).selectExpr("id AS a", "id AS b", "id AS c", "id AS d")
df.selectExpr(
  "sum(`d`) OVER(PARTITION BY `b`,`a`) as e",
  "sum(`c`) OVER(PARTITION BY `a`) as f"
).explain
{code}
Optimized plan (the two Window operators share a single Exchange on `a`, with the Window partitioned by `a` evaluated first):
{code:java}
== Physical Plan ==
*(4) Project [e#924L, f#925L]
+- Window [sum(d#43L) windowspecdefinition(b#41L, a#40L, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS e#924L], [b#41L, a#40L]
   +- *(3) Sort [b#41L ASC NULLS FIRST, a#40L ASC NULLS FIRST], false, 0
      +- *(3) Project [d#43L, b#41L, a#40L, f#925L]
         +- Window [sum(c#42L) windowspecdefinition(a#40L, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS f#925L], [a#40L]
            +- *(2) Sort [a#40L ASC NULLS FIRST], false, 0
               +- Exchange hashpartitioning(a#40L, 200), true, [id=#282]
                  +- *(1) Project [id#38L AS d#43L, id#38L AS b#41L, id#38L AS a#40L, id#38L AS c#42L]
                     +- *(1) Range (0, 10, step=1, splits=10)
{code}
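TransposeWindow can reorder two adjacent Window operators when the partition spec of one is a proper subset of the other's, as with `a` vs. `b`, `a` above. A minimal sketch of such a subset check (illustrative only, not Spark's actual implementation, which compares `Expression` trees; the hypothetical helper below uses a Set so membership tests are constant time rather than a nested scan):

{code:java}
// Illustrative sketch: true when ps1 is a proper subset of ps2, i.e. the
// Window partitioned by ps1 could run below the one partitioned by ps2
// and reuse its shuffle output without an extra Exchange.
def compatiblePartitions(ps1: Seq[String], ps2: Seq[String]): Boolean = {
  val ps2Set = ps2.toSet // O(1) lookups instead of scanning ps2 per element
  ps1.length < ps2.length && ps1.forall(ps2Set.contains)
}

// For the query above: PARTITION BY a is a subset of PARTITION BY b, a
compatiblePartitions(Seq("a"), Seq("b", "a")) // true
{code}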
> Optimize time complexity and extend applicable cases for TransposeWindow
> -------------------------------------------------------------------------
>
> Key: SPARK-38034
> URL: https://issues.apache.org/jira/browse/SPARK-38034
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: zhou xiang
> Priority: Minor
>
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)