On Wed, 31 May 2023 at 12:59, David Rowley <dgrowle...@gmail.com> wrote: > > On Wed, 12 Apr 2023 at 21:03, David Rowley <dgrowle...@gmail.com> wrote: > > I'll add this to the "Older bugs affecting stable branches" section of > > the PG 16 open items list > > When I wrote the above, it was very soon after the feature freeze for > PG16. I wondered, since we tend not to do cost changes as part of bug > fixes due to not wanting to destabilise plans between minor versions > if we could instead just fix it in PG16 given the freeze had *just* > started. That's no longer the case, so I'm just going to move this > out from where I added it in the PG16 Open items "Live issues" section > and just add a July CF entry for it instead.
I'm keen to move this patch along. It's not a particularly interesting patch and don't expect much interest in it, but I feel it's pretty important to have the planner not accidentally choose a cheap startup plan when a WindowAgg is going to fetch the entire subplan's tuples. I've made another pass over the patch and made a bunch of cosmetic changes. As far as mechanical changes, I only changed the EXCLUDE TIES and EXCLUDE GROUP behaviour when there is no ORDER BY clause in the WindowClause. If there's no ORDER BY then subtracting 1.0 rows seems like the right thing to do rather than what the previous patch did. I (temporarily) left the DEBUG1 elog in there if anyone wants to test for themselves (saves debugger use). In the absence of that, I'm planning on just pushing it to master only tomorrow. It seems fairly low risk and unlikely to attract too much interest since it only affects startup costs of WindowAgg nodes. I'm currently thinking it's a bad idea to backpatch this but I'd consider it more if someone else thought it was a good idea or if more people came along complaining about poor plan choice in plans containing WindowAggs. Currently, it seems better not to destabilise plans in the back branches. (CC'd Tim, who reported #17862, as he may have an opinion on this) The only thought I had while looking at this again aside from what I changed was if get_windowclause_startup_tuples() should go in selfuncs.c. I wondered if it would be neater to use convert_numeric_to_scalar() instead of the code I had to add to convert the (SMALL|BIG)INT Consts in <Const> FOLLOWING to double. Aside from that reason, it seems we don't have many usages of DEFAULT_INEQ_SEL outside of selfuncs.c. I didn't feel strongly enough about this to actually move the function. The updated patch is attached. Here are the results of my testing (note the DEBUG message matches the COUNT(*) result in all cases apart from one case where COUNT(*) returns 0 and the estimated tuples is 1.0). create table ab (a int, b int); insert into ab select a,b from generate_series(1,100) a, generate_series(1,100) b; analyze ab; set client_min_messages=debug1; # select count(*) over () from ab limit 1; DEBUG: startup_tuples = 10000 count ------- 10000 (1 row) # select count(*) over (partition by a) from ab limit 1; DEBUG: startup_tuples = 100 count ------- 100 (1 row) # select count(*) over (partition by a order by b) from ab limit 1; DEBUG: startup_tuples = 1 count ------- 1 (1 row) # select count(*) over (partition by a order by b rows between current row and unbounded following) from ab limit 1; DEBUG: startup_tuples = 100 count ------- 100 (1 row) # select count(*) over (partition by a order by b rows between current row and 10 following) from ab limit 1; DEBUG: startup_tuples = 11 count ------- 11 (1 row) # select count(*) over (partition by a order by b rows between current row and 10 following exclude current row) from ab limit 1; DEBUG: startup_tuples = 10 count ------- 10 (1 row) # select count(*) over (partition by a order by b rows between current row and 10 following exclude ties) from ab limit 1; DEBUG: startup_tuples = 11 count ------- 11 (1 row) # select count(*) over (partition by a order by b range between current row and 10 following exclude ties) from ab limit 1; DEBUG: startup_tuples = 11 count ------- 11 (1 row) # select count(*) over (partition by a order by b range between current row and unbounded following exclude ties) from ab limit 1; DEBUG: startup_tuples = 100 count ------- 100 (1 row) # select count(*) over (partition by a order by b range between current row and unbounded following exclude group) from ab limit 1; DEBUG: startup_tuples = 99 count ------- 99 (1 row) # select count(*) over (partition by a order by b groups between current row and unbounded following exclude group) from ab limit 1; DEBUG: startup_tuples = 99 count ------- 99 (1 row) # select count(*) over (partition by a rows between current row and unbounded following exclude group) from ab limit 1; DEBUG: startup_tuples = 1 count ------- 0 (1 row) # select count(*) over (partition by a rows between current row and unbounded following exclude ties) from ab limit 1; DEBUG: startup_tuples = 1 count ------- 1 (1 row) # select count(*) over (partition by a order by b rows between current row and unbounded following exclude ties) from ab limit 1; DEBUG: startup_tuples = 100 count ------- 100 (1 row) # select count(*) over (partition by a order by b rows between current row and unbounded following exclude current row) from ab limit 1; DEBUG: startup_tuples = 99 count ------- 99 (1 row) # select count(*) over (partition by a order by b range between current row and 9223372036854775807 following exclude ties) from ab limit 1; DEBUG: startup_tuples = 100 count ------- 100 (1 row) David
fix_bug_17862_v3.patch
Description: Binary data