Re: [PR] Parallel bounded RANGE-frame window functions without PARTITION BY (draft) [datafusion]

via GitHub Tue, 30 Jun 2026 12:29:55 -0700


Dandandan commented on code in PR #23026:
URL: https://github.com/apache/datafusion/pull/23026#discussion_r3501337552



##########
datafusion/sqllogictest/test_files/parallel_window.slt:
##########
@@ -0,0 +1,141 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Goal: parallelize window functions that have ORDER BY but no PARTITION BY,
+# over a bounded RANGE frame. Today this collapses to a single partition via
+# BoundedWindowAggExec::required_input_distribution -> Distribution::SinglePartition
+# (datafusion/physical-plan/src/windows/bounded_window_agg_exec.rs:333-340).
+
+statement ok
+set datafusion.execution.target_partitions = 4;
+
+statement ok
+set datafusion.explain.show_statistics = true;
+
+# Build four parquet files with OVERLAPPING seq ranges so range-repartitioning
+# actually has to move rows around. Each file has seq congruent to its index
+# mod 4 in [0, 100); a deterministic scramble (seq * 37 mod 100) defeats the
+# natural generate_series ordering so DataFusion does not claim output_ordering
+# on seq. Per-file min/max still come back Exact from the parquet footer.
+query I
+COPY (SELECT seq, seq % 7 AS amount FROM (
+        SELECT value AS seq FROM generate_series(0, 99, 4)
+      ) ORDER BY (seq * 37) % 100)
+TO 'test_files/scratch/parallel_window/events/0.parquet'
+STORED AS PARQUET;
+----
+25
+
+query I
+COPY (SELECT seq, seq % 7 AS amount FROM (
+        SELECT value AS seq FROM generate_series(1, 99, 4)
+      ) ORDER BY (seq * 37) % 100)
+TO 'test_files/scratch/parallel_window/events/1.parquet'
+STORED AS PARQUET;
+----
+25
+
+query I
+COPY (SELECT seq, seq % 7 AS amount FROM (
+        SELECT value AS seq FROM generate_series(2, 99, 4)
+      ) ORDER BY (seq * 37) % 100)
+TO 'test_files/scratch/parallel_window/events/2.parquet'
+STORED AS PARQUET;
+----
+25
+
+query I
+COPY (SELECT seq, seq % 7 AS amount FROM (
+        SELECT value AS seq FROM generate_series(3, 99, 4)
+      ) ORDER BY (seq * 37) % 100)
+TO 'test_files/scratch/parallel_window/events/3.parquet'
+STORED AS PARQUET;
+----
+25
+
+statement ok
+CREATE EXTERNAL TABLE events
+STORED AS PARQUET
+LOCATION 'test_files/scratch/parallel_window/events';
+
+# Bounded RANGE frame, ORDER BY only (no PARTITION BY). Canonical shape.
+# Each input partition should report Exact min/max on seq from its parquet footer.
+query TT
+EXPLAIN SELECT
+    seq,
+    SUM(amount) OVER (
+        ORDER BY seq
+        RANGE BETWEEN 5 PRECEDING AND CURRENT ROW
+    ) AS rolling_sum
+FROM events
+ORDER BY seq
+LIMIT 5;
+----
+logical_plan
+01)Sort: events.seq ASC NULLS LAST, fetch=5
+02)--Projection: events.seq, sum(events.amount) ORDER BY [events.seq ASC NULLS LAST] RANGE BETWEEN 5 PRECEDING AND CURRENT ROW AS rolling_sum
+03)----WindowAggr: windowExpr=[[sum(events.amount) ORDER BY [events.seq ASC NULLS LAST] RANGE BETWEEN 5 PRECEDING AND CURRENT ROW]]
+04)------TableScan: events projection=[seq, amount]
+physical_plan
+01)SortPreservingMergeExec: [seq@0 ASC NULLS LAST], fetch=5, statistics=[Rows=Absent, Bytes=Absent, [(Col[0]:),(Col[1]:)]]

Review Comment:
   (I think it's a bit wasteful that it has to do a single-threaded SPM when it already did a RangeRepartition before.)
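
   To make the point concrete, here is a minimal Python sketch (not DataFusion code). It assumes the range repartition yields partitions with disjoint, ascending key ranges and that each partition is already sorted on seq; whether the plan in this PR guarantees that is not visible in the excerpt above. The helper `range_partition` and the split points `[25, 50, 75]` are hypothetical and exist only for illustration. Under those assumptions, a k-way merge and a plain in-order concatenation produce the same rows:

```python
import heapq


def range_partition(keys, bounds):
    """Hypothetical helper: split keys into len(bounds) + 1 partitions.

    Partition i receives the keys k with bounds[i-1] <= k < bounds[i],
    so the partitions cover disjoint, ascending key ranges.
    Each partition is sorted locally before it is returned.
    """
    parts = [[] for _ in range(len(bounds) + 1)]
    for k in keys:
        idx = sum(k >= b for b in bounds)  # number of bounds at or below k
        parts[idx].append(k)
    return [sorted(p) for p in parts]


# Same scramble as the test data: 37 is coprime with 100, so this is a
# permutation of 0..99 in a non-sorted order.
keys = [(seq * 37) % 100 for seq in range(100)]

# Assumed split points for 4 target partitions (illustrative only).
parts = range_partition(keys, [25, 50, 75])

# Option A: order-preserving k-way merge, analogous to a single-threaded merge.
merged = list(heapq.merge(*parts))[:5]

# Option B: read the partitions one after another in range order, no comparisons.
concatenated = [k for p in parts for k in p][:5]

print(merged == concatenated)
```

   With a fetch of 5, the in-order read in this sketch would only need to touch the first partition, which is the inefficiency the comment is pointing at.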



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


