[ https://issues.apache.org/jira/browse/SPARK-42525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691904#comment-17691904 ]
Apache Spark commented on SPARK-42525:
--------------------------------------

User 'zml1206' has created a pull request for this issue:
https://github.com/apache/spark/pull/40115

> collapse two adjacent windows with the same partition/order in subquery
> -----------------------------------------------------------------------
>
>                 Key: SPARK-42525
>                 URL: https://issues.apache.org/jira/browse/SPARK-42525
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.2.3
>            Reporter: zhuml
>            Priority: Major
>
> Extend the CollapseWindow rule to collapse Window nodes when one of the
> windows is in a subquery.
>
> {code:java}
> select a, b, c, row_number() over (partition by a order by b) as d from
> ( select a, b, rank() over (partition by a order by b) as c from t1) t2
>
> == Optimized Logical Plan ==
> before
> Window [row_number() windowspecdefinition(a#11, b#12 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS d#26], [a#11], [b#12 ASC NULLS FIRST]
> +- Window [rank(b#12) windowspecdefinition(a#11, b#12 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS c#25], [a#11], [b#12 ASC NULLS FIRST]
>    +- InMemoryRelation [a#11, b#12], StorageLevel(disk, memory, deserialized, 1 replicas)
>       +- *(1) Project [_1#6 AS a#11, _2#7 AS b#12]
>          +- *(1) SerializeFromObject [knownnotnull(assertnotnull(input[0, scala.Tuple2, true]))._1 AS _1#6, knownnotnull(assertnotnull(input[0, scala.Tuple2, true]))._2 AS _2#7]
>             +- *(1) MapElements org.apache.spark.sql.DataFrameSuite$$Lambda$1517/1628848368@3a479fda, obj#5: scala.Tuple2
>                +- *(1) DeserializeToObject staticinvoke(class java.lang.Long, ObjectType(class java.lang.Long), valueOf, id#0L, true, false, true), obj#4: java.lang.Long
>                   +- *(1) Range (0, 10, step=1, splits=2)
>
> after
> Window [rank(b#12) windowspecdefinition(a#11, b#12 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS c#25, row_number() windowspecdefinition(a#11, b#12 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS d#26], [a#11], [b#12 ASC NULLS FIRST]
> +- InMemoryRelation [a#11, b#12], StorageLevel(disk, memory, deserialized, 1 replicas)
>    +- *(1) Project [_1#6 AS a#11, _2#7 AS b#12]
>       +- *(1) SerializeFromObject [knownnotnull(assertnotnull(input[0, scala.Tuple2, true]))._1 AS _1#6, knownnotnull(assertnotnull(input[0, scala.Tuple2, true]))._2 AS _2#7]
>          +- *(1) MapElements org.apache.spark.sql.DataFrameSuite$$Lambda$1518/1928028672@4d7a64ca, obj#5: scala.Tuple2
>             +- *(1) DeserializeToObject staticinvoke(class java.lang.Long, ObjectType(class java.lang.Long), valueOf, id#0L, true, false, true), obj#4: java.lang.Long
>                +- *(1) Range (0, 10, step=1, splits=2)
> {code}
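The merge shown in the before/after plans can be illustrated with a small toy model. This is only a sketch, not Spark's Catalyst code and not the change made in the pull request: the `Relation`, `WindowExpr`, and `Window` classes and the `collapse_windows` function are hypothetical names invented for this example. It also assumes that, besides matching partition and order specs, the upper window must not read any column produced by the lower one, and it covers only directly adjacent Window nodes.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Relation:
    """Leaf node standing in for a table such as t1 (hypothetical)."""
    name: str
    output: Tuple[str, ...]


@dataclass(frozen=True)
class WindowExpr:
    """One window function call, e.g. rank(b) AS c (hypothetical)."""
    func: str
    alias: str
    inputs: Tuple[str, ...] = ()  # columns the function itself reads


@dataclass(frozen=True)
class Window:
    """Toy Window node: expressions, partition spec, order spec, child."""
    exprs: Tuple[WindowExpr, ...]
    partition_by: Tuple[str, ...]
    order_by: Tuple[str, ...]
    child: object


def references(w: Window) -> set:
    """Columns a window node reads: its specs plus function inputs."""
    cols = set(w.partition_by) | set(w.order_by)
    for e in w.exprs:
        cols |= set(e.inputs)
    return cols


def can_collapse(upper: Window, lower: Window) -> bool:
    """Assumed conditions: same partition/order, upper independent of lower."""
    lower_outputs = {e.alias for e in lower.exprs}
    return (
        upper.partition_by == lower.partition_by
        and upper.order_by == lower.order_by
        and not (references(upper) & lower_outputs)
    )


def collapse_windows(plan):
    """Bottom-up pass that merges directly adjacent, compatible windows."""
    if not isinstance(plan, Window):
        return plan
    child = collapse_windows(plan.child)
    plan = replace(plan, child=child)
    if isinstance(child, Window) and can_collapse(plan, child):
        # Lower expressions come first, matching c#25 before d#26 above.
        return Window(child.exprs + plan.exprs, plan.partition_by,
                      plan.order_by, child.child)
    return plan


# The query from the issue: rank() in the subquery, row_number() outside.
t1 = Relation("t1", ("a", "b"))
inner = Window((WindowExpr("rank", "c", ("b",)),), ("a",), ("b",), t1)
outer = Window((WindowExpr("row_number", "d"),), ("a",), ("b",), inner)

merged = collapse_windows(outer)
print([e.alias for e in merged.exprs])  # ['c', 'd']
```

In this model the two windows end up as one node directly over `t1`, so the data would be partitioned and sorted once for both functions. If the partition or order specs differ, or the outer function reads `c`, the plan is returned unchanged.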