On 08/01/23 16:33, David Rowley wrote:
On Sun, 8 Jan 2023 at 23:21, Ankit Kumar Pandey <itsanki...@gmail.com> wrote:
Please find attached patch with addressed issues mentioned before.
Here's a quick review:
1. You can use foreach_current_index(l) to obtain the index of the list element.
2. I'd rather see you looping backwards over the list in the first
loop. I think it'll be more efficient to loop backwards as you can
just break out the loop when you see the pathkeys are not contained in
the order by pathkreys. When the optimisation does not apply that
means you only need to look at the last item in the list. You could
maybe just invent foreach_reverse() for this purpose and put it in
pg_list.h. That'll save having to manually code up the loop.
3. I don't think you should call the variable
enable_order_by_pushdown. We've a bunch of planner related GUCs that
start with enable_. That might cause a bit of confusion. Maybe just
try_sort_pushdown.
4. You should widen the scope of orderby_pathkeys and set it within
the if (enable_order_by_pushdown) scope. You can reuse this again in
the 2nd loop too. Just initialise it to NIL
5. You don't seem to be checking all WindowClauses for a runCondtion.
If you do #2 above then you can start that process in the initial
reverse loop so that you've checked them all by the time you get
around to that WindowClause again in the 2nd loop.
6. The test with "+-- Do not perform additional sort if column is
presorted". I don't think "additional" is the correct word here. I
think you want to ensure that we don't push down the ORDER BY below
the WindowAgg for this case. There is no ability to reduce the sorts
here, only move them around, which we agreed we don't want to do for
this case.
I have addressed all points 1-6 in the attached patch.
I have one doubt regarding runCondition, do we only need to ensure
that window function which has subset sort clause of main query should
not have runCondition or none of the window functions should not contain
runCondition? I have gone with later case but wanted to clarify.
Also, do you want to have a go at coding up the sort bound pushdown
too? It'll require removing the limitCount restriction and adjusting
ExecSetTupleBound() to recurse through a WindowAggState. I think it's
pretty easy. You could try it then play around with it to make sure it
works and we get the expected performance.
I tried this in the patch but kept getting `retrieved too many tuples in
a bounded sort`.
Added following code in ExecSetTupleBound which correctly found sortstate
and set bound value.
else if(IsA(child_node, WindowAggState))
{
WindowAggState *winstate = (WindowAggState *) child_node;
if (outerPlanState(winstate))
ExecSetTupleBound(tuples_needed,
outerPlanState(winstate));
}
I think problem is that are not using limit clause inside window
function (which
may need to scan all tuples) so passing bound value to
WindowAggState->sortstate
is not working as we might expect. Or maybe I am getting it wrong? I was
trying to
have top-N sort for limit clause if orderby pushdown happens.
You'll likely want to add a few more rows than the last performance test you
did or run the query
with pgbench. Running a query once that only takes 1-2ms is likely not
a reliable way to test the performance of something.
I did some profiling.
CREATE TABLE empsalary1 (
depname varchar,
empno bigint,
salary int,
enroll_date date
);
INSERT INTO empsalary1(depname, empno, salary, enroll_date)
SELECT string_agg
(substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', ceil
(random() * 62)::integer, 1000), '')
AS depname, generate_series(1, 10000000) AS empno, ceil
(random()*10000)::integer AS salary
, NOW() + (random() * (interval '90 days')) + '30 days' AS enroll_date;
1) Optimization case
EXPLAIN (ANALYZE)
SELECT empno,
depname,
min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary,
sum(salary) OVER (PARTITION BY depname) depsalary,
count(*) OVER (ORDER BY enroll_date DESC) c
FROM empsalary1
ORDER BY depname, empno, enroll_date;
EXPLAIN (ANALYZE)
SELECT empno,
depname,
min(salary) OVER (PARTITION BY depname) depminsalary
FROM empsalary1
ORDER BY depname, empno;
In patched version:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
-----------------------
WindowAgg (cost=93458.04..93458.08 rows=1 width=61) (actual
time=149996.006..156756.995 rows=10000000 loops=1)
-> WindowAgg (cost=93458.04..93458.06 rows=1 width=57) (actual
time=108559.126..135892.188 rows=10000000 loops=1)
-> Sort (cost=93458.04..93458.04 rows=1 width=53) (actual
time=108554.213..112564.168 rows=10000000 loops=1)
Sort Key: depname, empno, enroll_date
Sort Method: external merge Disk: 645856kB
-> WindowAgg (cost=93458.01..93458.03 rows=1 width=53) (actual
time=30386.551..62357.669 rows=10000000 lo
ops=1)
-> Sort (cost=93458.01..93458.01 rows=1 width=45)
(actual time=23260.104..26313.395 rows=10000000 l
oops=1)
Sort Key: enroll_date DESC
Sort Method: external merge Disk: 528448kB
-> Seq Scan on empsalary1 (cost=0.00..93458.00
rows=1 width=45) (actual time=0.032..4833.603
rows=10000000 loops=1)
Planning Time: 4.693 ms
Execution Time: 158161.281 ms
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
-------------
WindowAgg (cost=1903015.63..2078015.74 rows=10000006 width=39) (actual
time=40565.305..46598.984 rows=10000000 loops=1)
-> Sort (cost=1903015.63..1928015.65 rows=10000006 width=39) (actual
time=23411.837..27467.962 rows=10000000 loops=1)
Sort Key: depname, empno
Sort Method: external merge Disk: 528448kB
-> Seq Scan on empsalary1 (cost=0.00..193458.06 rows=10000006
width=39) (actual time=5.095..5751.675 rows=10000
000 loops=1)
Planning Time: 0.099 ms
Execution Time: 47415.926 ms
In master:
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------
-------------------------------------
Incremental Sort (cost=3992645.36..4792645.79 rows=10000006 width=59) (actual
time=147130.132..160985.373 rows=10000000
loops=1)
Sort Key: depname, empno, enroll_date
Presorted Key: depname, empno
Full-sort Groups: 312500 Sort Method: quicksort Average Memory: 28kB Peak
Memory: 28kB
-> WindowAgg (cost=3992645.31..4342645.52 rows=10000006 width=59) (actual
time=147129.936..154023.147 rows=10000000 l
oops=1)
-> WindowAgg (cost=3992645.31..4192645.43 rows=10000006 width=55)
(actual time=104665.289..133089.188 rows=1000
0000 loops=1)
-> Sort (cost=3992645.31..4017645.33 rows=10000006 width=51)
(actual time=104665.257..108710.282 rows=100
00000 loops=1)
Sort Key: depname, empno
Sort Method: external merge Disk: 645856kB
-> WindowAgg (cost=1971370.63..2146370.74 rows=10000006
width=51) (actual time=28314.300..59737.949
rows=10000000 loops=1)
-> Sort (cost=1971370.63..1996370.65 rows=10000006
width=43) (actual time=21190.188..24098.59
6 rows=10000000 loops=1)
Sort Key: enroll_date DESC
Sort Method: external merge Disk: 528448kB
-> Seq Scan on empsalary1
(cost=0.00..193458.06 rows=10000006 width=43) (actual time=0.
630..5317.862 rows=10000000 loops=1)
Planning Time: 0.982 ms
Execution Time: 163369.242 ms
(16 rows)
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
-------------------
Incremental Sort (cost=3787573.31..3912573.41 rows=10000006 width=39) (actual
time=51547.195..53950.034 rows=10000000 lo
ops=1)
Sort Key: depname, empno
Presorted Key: depname
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 30kB Peak
Memory: 30kB
Pre-sorted Groups: 1 Sort Method: external merge Average Disk: 489328kB
Peak Disk: 489328kB
-> WindowAgg (cost=1903015.63..2078015.74 rows=10000006 width=39) (actual
time=33413.954..39771.262 rows=10000000 loo
ps=1)
-> Sort (cost=1903015.63..1928015.65 rows=10000006 width=39) (actual
time=18991.129..21992.353 rows=10000000 lo
ops=1)
Sort Key: depname
Sort Method: external merge Disk: 528456kB
-> Seq Scan on empsalary1 (cost=0.00..193458.06 rows=10000006
width=39) (actual time=1.300..5269.729 rows
=10000000 loops=1)
Planning Time: 4.506 ms
Execution Time: 54768.697 ms
2) No optimization case:
EXPLAIN (ANALYZE)
SELECT empno,
depname,
min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary
FROM empsalary1
ORDER BY enroll_date;
Patch:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
--------------------------------
Sort (cost=3968191.79..3993191.80 rows=10000006 width=43) (actual
time=57863.173..60976.324 rows=10000000 loops=1)
Sort Key: enroll_date
Sort Method: external merge Disk: 528448kB
-> WindowAgg (cost=850613.62..2190279.21 rows=10000006 width=43)
(actual time=7478.966..42502.541 rows=10000000 loops
=1)
-> Gather Merge (cost=850613.62..2015279.11 rows=10000006
width=43) (actual time=7478.935..18037.001 rows=10000
000 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=849613.60..860030.27 rows=4166669
width=43) (actual time=7349.101..9397.713 rows=3333333 lo
ops=3)
Sort Key: depname, empno
Sort Method: external merge Disk: 181544kB
Worker 0: Sort Method: external merge Disk: 169328kB
Worker 1: Sort Method: external merge Disk: 177752kB
-> Parallel Seq Scan on empsalary1
(cost=0.00..135124.69 rows=4166669 width=43) (actual time=0.213.
.2450.635 rows=3333333 loops=3)
Planning Time: 0.100 ms
Execution Time: 63341.783 ms
master:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
--------------------------------
Sort (cost=3968191.79..3993191.80 rows=10000006 width=43) (actual
time=54097.880..57000.806 rows=10000000 loops=1)
Sort Key: enroll_date
Sort Method: external merge Disk: 528448kB
-> WindowAgg (cost=850613.62..2190279.21 rows=10000006 width=43)
(actual time=7075.245..39200.756 rows=10000000 loops
=1)
-> Gather Merge (cost=850613.62..2015279.11 rows=10000006
width=43) (actual time=7075.217..15988.922 rows=10000
000 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=849613.60..860030.27 rows=4166669
width=43) (actual time=6993.974..8799.701 rows=3333333 lo
ops=3)
Sort Key: depname, empno
Sort Method: external merge Disk: 171904kB
Worker 0: Sort Method: external merge Disk: 178496kB
Worker 1: Sort Method: external merge Disk: 178224kB
-> Parallel Seq Scan on empsalary1
(cost=0.00..135124.69 rows=4166669 width=43) (actual time=0.044.
.2683.598 rows=3333333 loops=3)
Planning Time: 5.718 ms
Execution Time: 59188.469 ms
(15 rows)
Master and patch have same performance as plan is same.
pgbench (this is to find average performance):
create table empsalary2 as select * from empsalary1 limit 1000;
-------------------------------------------------------------------
test.sql
SELECT empno,
depname,
min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary,
sum(salary) OVER (PARTITION BY depname) depsalary,
count(*) OVER (ORDER BY enroll_date DESC) c
FROM empsalary2
ORDER BY depname, empno, enroll_date;
SELECT empno,
depname,
min(salary) OVER (PARTITION BY depname) depminsalary
FROM empsalary2
ORDER BY depname, empno;
----------------------------------------------------------------------
/usr/local/pgsql/bin/pgbench -d test -c 10 -j 4 -t 1000 -f test.sql
Patch:
transaction type: test.sql
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 4
maximum number of tries: 1
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
number of failed transactions: 0 (0.000%)
latency average = 55.262 ms
initial connection time = 8.480 ms
tps = 180.957685 (without initial connection time)
Master:
transaction type: test.sql
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 4
maximum number of tries: 1
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
number of failed transactions: 0 (0.000%)
latency average = 60.489 ms
initial connection time = 7.069 ms
tps = 165.320205 (without initial connection time)
TPS of patched version is higher than that of master's for same set of
queries.
where optimization is performed.
--
Regards,
Ankit Kumar Pandey
From f3dd5a27f63030dea8d1530896daf4ba1c8a5265 Mon Sep 17 00:00:00 2001
From: Ankit Kumar Pandey <itsanki...@gmail.com>
Date: Sat, 7 Jan 2023 19:10:24 +0530
Subject: [PATCH] Additional sort for root's order by can be optimized if
sorting for extra columns are performed while sorting for window functions
(if sort clause of window functions are subset of root's sort clause).
---
src/backend/optimizer/plan/planner.c | 68 +++++++++-
src/include/nodes/pg_list.h | 14 ++
src/test/regress/expected/window.out | 187 +++++++++++++++++++++++++++
src/test/regress/sql/window.sql | 94 ++++++++++++++
4 files changed, 362 insertions(+), 1 deletion(-)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 000d757bdd..10d68e2fc3 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -4423,6 +4423,13 @@ create_one_window_path(PlannerInfo *root,
PathTarget *window_target;
ListCell *l;
List *topqual = NIL;
+ bool has_runcondition = false;
+
+ List *window_pathkeys;
+ List *orderby_pathkeys = NIL;
+ int index = 0;
+ int i;
+ bool try_sort_pushdown = false;
/*
* Since each window clause could require a different sort order, we stack
@@ -4441,10 +4448,41 @@ create_one_window_path(PlannerInfo *root,
*/
window_target = input_target;
+ try_sort_pushdown = (root->parse->distinctClause == NIL &&
+ root->parse->sortClause != NIL &&
+ root->parse->limitCount == NULL &&
+ !contain_window_function((Node *) get_sortgrouplist_exprs(root->parse->sortClause,
+ root->processed_tlist)));
+
+
+ if (try_sort_pushdown)
+ {
+ /*
+ * Search for the window function whose path keys are
+ * subset of orderby_path keys, this allows us to perform
+ * order by pushdown from this position of window function onwards
+ */
+ foreach_reverse(l, activeWindows)
+ {
+ WindowClause *wc = lfirst_node(WindowClause, l);
+ orderby_pathkeys = make_pathkeys_for_sortclauses(root,
+ root->parse->sortClause,
+ root->processed_tlist);
+ window_pathkeys = make_pathkeys_for_window(root,
+ wc,
+ root->processed_tlist);
+ has_runcondition |= (wc->runCondition != NIL);
+ if (!pathkeys_contained_in(window_pathkeys, orderby_pathkeys) || has_runcondition)
+ break;
+
+ index = foreach_current_index(l);
+ }
+ }
+
+ i=0;
foreach(l, activeWindows)
{
WindowClause *wc = lfirst_node(WindowClause, l);
- List *window_pathkeys;
int presorted_keys;
bool is_sorted;
bool topwindow;
@@ -4457,6 +4495,34 @@ create_one_window_path(PlannerInfo *root,
path->pathkeys,
&presorted_keys);
+ /*
+ * If not sorted already and the query has an ORDER BY clause, then we
+ * try to push the entire ORDER BY below the final WindowAgg. We
+ * don't do this when the query has a DISTINCT clause as the planner
+ * might opt to do a Hash Aggregate and spoil our efforts here. We
+ * also can't do this when the ORDER BY contains any WindowFuncs as we
+ * obviously can't sort something that's yet to be evaluated. We also
+ * don't bother with this when there is a WindowClauses with a
+ * runCondition as those can reduce the number of rows to sort in the
+ * ORDER BY and we don't want add complexity to the WindowAgg sort if
+ * the final ORDER BY sort is going to have far fewer rows to sort.
+ * For the same reason, don't bother if the query has a LIMIT clause.
+ */
+ if (try_sort_pushdown && !is_sorted && (i == index))
+ {
+ orderby_pathkeys = make_pathkeys_for_sortclauses(root,
+ root->parse->sortClause,
+ root->processed_tlist);
+
+ if (pathkeys_contained_in(window_pathkeys, orderby_pathkeys))
+ window_pathkeys = orderby_pathkeys;
+
+ is_sorted = pathkeys_count_contained_in(window_pathkeys,
+ path->pathkeys,
+ &presorted_keys);
+ }
+ i++;
+
/* Sort if necessary */
if (!is_sorted)
{
diff --git a/src/include/nodes/pg_list.h b/src/include/nodes/pg_list.h
index 529a382d28..82d108f672 100644
--- a/src/include/nodes/pg_list.h
+++ b/src/include/nodes/pg_list.h
@@ -378,6 +378,20 @@ lnext(const List *l, const ListCell *c)
(cell = NULL, false); \
cell##__state.i++)
+/*
+ * foreach_reverse -
+ * a convenience macro for looping through a list in reverse
+ *
+ * This is equivalent to foreach, only in reverse
+ */
+#define foreach_reverse(cell, lst) \
+ for (ForEachState cell##__state = {(lst), cell##__state.l->length-1}; \
+ (cell##__state.l != NIL && \
+ cell##__state.i >= 0) ? \
+ (cell = &cell##__state.l->elements[cell##__state.i], true) : \
+ (cell = NULL, false); \
+ cell##__state.i--)
+
/*
* foreach_delete_current -
* delete the current list element from the List associated with a
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 90e89fb5b6..d83fea8091 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -3982,7 +3982,194 @@ ORDER BY depname, empno;
-> Seq Scan on empsalary
(11 rows)
+-- Do not perform sort pushdown if column is presorted
+CREATE INDEX depname_idx ON empsalary(depname);
+SET enable_seqscan=0;
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ min(salary) OVER (PARTITION BY depname) depminsalary
+FROM empsalary
+ORDER BY depname, empno;
+ QUERY PLAN
+-------------------------------------------------------
+ Incremental Sort
+ Sort Key: depname, empno
+ Presorted Key: depname
+ -> WindowAgg
+ -> Index Scan using depname_idx on empsalary
+(5 rows)
+
+-- Do perform sort pushdown if columns are only partially sorted
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ min(salary) OVER (PARTITION BY depname, empno) depminsalary
+FROM empsalary
+ORDER BY depname, empno, enroll_date;
+ QUERY PLAN
+-------------------------------------------------------
+ WindowAgg
+ -> Incremental Sort
+ Sort Key: depname, empno, enroll_date
+ Presorted Key: depname
+ -> Index Scan using depname_idx on empsalary
+(5 rows)
+
+DROP INDEX depname_idx;
+RESET enable_seqscan;
RESET enable_hashagg;
+-- If window function's ORDER BY is subset of QUERY's ORDER BY, sort as per as
+-- QUERY's ORDER BY to save additional sorting in the end
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ min(salary) OVER (ORDER BY depname) depminsalary
+FROM empsalary
+ORDER BY depname, empno;
+ QUERY PLAN
+-----------------------------------
+ WindowAgg
+ -> Sort
+ Sort Key: depname, empno
+ -> Seq Scan on empsalary
+(4 rows)
+
+-- Same as above, but PARTITION BY clause is included as well
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary
+FROM empsalary
+ORDER BY depname, empno, enroll_date;
+ QUERY PLAN
+-----------------------------------------------
+ WindowAgg
+ -> Sort
+ Sort Key: depname, empno, enroll_date
+ -> Seq Scan on empsalary
+(4 rows)
+
+-- Negation of above, when sort optimization is not needed to be performed
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary
+FROM empsalary
+ORDER BY empno, enroll_date;
+ QUERY PLAN
+-----------------------------------------
+ Sort
+ Sort Key: empno, enroll_date
+ -> WindowAgg
+ -> Sort
+ Sort Key: depname, empno
+ -> Seq Scan on empsalary
+(6 rows)
+
+-- ORDER BY's in multiple Window functions can be combined into one
+-- if they are subset of QUERY's ORDER BY
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary,
+ sum(salary) OVER (PARTITION BY depname) depsalary,
+ count(*) OVER (ORDER BY enroll_date DESC) c
+FROM empsalary
+ORDER BY depname, empno, enroll_date;
+ QUERY PLAN
+------------------------------------------------------
+ WindowAgg
+ -> WindowAgg
+ -> Sort
+ Sort Key: depname, empno, enroll_date
+ -> WindowAgg
+ -> Sort
+ Sort Key: enroll_date DESC
+ -> Seq Scan on empsalary
+(8 rows)
+
+-- order of window function shouldn't affect query's ORDER BY pushdown
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ count(*) OVER (ORDER BY enroll_date DESC) c,
+ sum(salary) OVER (PARTITION BY depname) depsalary,
+ min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary
+FROM empsalary
+ORDER BY depname, empno, enroll_date;
+ QUERY PLAN
+------------------------------------------------------
+ WindowAgg
+ -> WindowAgg
+ -> Sort
+ Sort Key: depname, empno, enroll_date
+ -> WindowAgg
+ -> Sort
+ Sort Key: enroll_date DESC
+ -> Seq Scan on empsalary
+(8 rows)
+
+-- SORT OPTIMIZATION (as in cases above) is not needed to be performed
+-- in cases, as mentioned below.
+-- Case #1: When distinct clause is included
+EXPLAIN (COSTS OFF)
+SELECT DISTINCT empno,
+ depname,
+ min(salary) OVER (PARTITION BY depname) depminsalary,
+ sum(salary) OVER (PARTITION BY depname) depsalary,
+ count(*) OVER (ORDER BY enroll_date DESC) c
+FROM empsalary
+ORDER BY depname, empno;
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: depname, empno, (min(salary) OVER (?)), (sum(salary) OVER (?)), (count(*) OVER (?))
+ -> WindowAgg
+ -> Sort
+ Sort Key: depname
+ -> WindowAgg
+ -> Sort
+ Sort Key: enroll_date DESC
+ -> Seq Scan on empsalary
+(10 rows)
+
+-- Case #2: When Limit clause is specified
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ min(salary) OVER (PARTITION BY empno) depminsalary,
+ sum(salary) OVER (PARTITION BY empno) depsalary
+FROM empsalary
+ORDER BY empno, depname LIMIT 1;
+ QUERY PLAN
+-----------------------------------------------
+ Limit
+ -> Incremental Sort
+ Sort Key: empno, depname
+ Presorted Key: empno
+ -> WindowAgg
+ -> Sort
+ Sort Key: empno
+ -> Seq Scan on empsalary
+(8 rows)
+
+-- Case #3: When ORDER BY contains any WindowFuncs
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ row_number() over (PARTITION BY empno) esalary
+FROM empsalary
+ORDER BY empno, esalary;
+ QUERY PLAN
+--------------------------------------------
+ Incremental Sort
+ Sort Key: empno, (row_number() OVER (?))
+ Presorted Key: empno
+ -> WindowAgg
+ -> Sort
+ Sort Key: empno
+ -> Seq Scan on empsalary
+(7 rows)
+
-- Test Sort node reordering
EXPLAIN (COSTS OFF)
SELECT
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index b7bd0a83da..e989e93737 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -1324,8 +1324,102 @@ SELECT DISTINCT
FROM empsalary
ORDER BY depname, empno;
+-- Do not perform sort pushdown if column is presorted
+CREATE INDEX depname_idx ON empsalary(depname);
+SET enable_seqscan=0;
+
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ min(salary) OVER (PARTITION BY depname) depminsalary
+FROM empsalary
+ORDER BY depname, empno;
+
+-- Do perform sort pushdown if columns are only partially sorted
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ min(salary) OVER (PARTITION BY depname, empno) depminsalary
+FROM empsalary
+ORDER BY depname, empno, enroll_date;
+
+DROP INDEX depname_idx;
+RESET enable_seqscan;
RESET enable_hashagg;
+-- If window function's ORDER BY is subset of QUERY's ORDER BY, sort as per as
+-- QUERY's ORDER BY to save additional sorting in the end
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ min(salary) OVER (ORDER BY depname) depminsalary
+FROM empsalary
+ORDER BY depname, empno;
+
+-- Same as above, but PARTITION BY clause is included as well
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary
+FROM empsalary
+ORDER BY depname, empno, enroll_date;
+
+-- Negation of above, when sort optimization is not needed to be performed
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary
+FROM empsalary
+ORDER BY empno, enroll_date;
+
+-- ORDER BY's in multiple Window functions can be combined into one
+-- if they are subset of QUERY's ORDER BY
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary,
+ sum(salary) OVER (PARTITION BY depname) depsalary,
+ count(*) OVER (ORDER BY enroll_date DESC) c
+FROM empsalary
+ORDER BY depname, empno, enroll_date;
+
+-- order of window function shouldn't affect query's ORDER BY pushdown
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ count(*) OVER (ORDER BY enroll_date DESC) c,
+ sum(salary) OVER (PARTITION BY depname) depsalary,
+ min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary
+FROM empsalary
+ORDER BY depname, empno, enroll_date;
+
+-- SORT OPTIMIZATION (as in cases above) is not needed to be performed
+-- in cases, as mentioned below.
+
+-- Case #1: When distinct clause is included
+EXPLAIN (COSTS OFF)
+SELECT DISTINCT empno,
+ depname,
+ min(salary) OVER (PARTITION BY depname) depminsalary,
+ sum(salary) OVER (PARTITION BY depname) depsalary,
+ count(*) OVER (ORDER BY enroll_date DESC) c
+FROM empsalary
+ORDER BY depname, empno;
+
+-- Case #2: When Limit clause is specified
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ depname,
+ min(salary) OVER (PARTITION BY empno) depminsalary,
+ sum(salary) OVER (PARTITION BY empno) depsalary
+FROM empsalary
+ORDER BY empno, depname LIMIT 1;
+
+-- Case #3: When ORDER BY contains any WindowFuncs
+EXPLAIN (COSTS OFF)
+SELECT empno,
+ row_number() over (PARTITION BY empno) esalary
+FROM empsalary
+ORDER BY empno, esalary;
+
-- Test Sort node reordering
EXPLAIN (COSTS OFF)
SELECT
--
2.37.2