.On Thu, 13 Apr 2023 at 02:28, Andy Fan <zhihui.fan1...@gmail.com> wrote: > The concept of startup_tuples for a WindowAgg looks good to me, but I > can't follow up with the below line: > > + return clamp_row_est(partition_tuples * DEFAULT_INEQ_SEL); > > # select count(*) over() from tenk1 limit 1; > count > ------- > 10000 --> We need to scan all the tuples. > > Should we just return clamp_row_est(partition_tuples)?
For the case you've shown, it will. It's handled by this code: if (wc->orderClause == NIL) return clamp_row_est(partition_tuples); It would take something like the following to hit the code you're concerned about: explain select count(*) over(order by unique1 rows between unbounded preceding and 10*random() following) from tenk1; QUERY PLAN -------------------------------------------------------------------------------------------- WindowAgg (cost=140.23..420.29 rows=10000 width=12) -> Index Only Scan using tenk1_unique1 on tenk1 (cost=0.29..270.29 rows=10000 width=4) (2 rows) You can see the startup cost is about 33% of the total cost for that, which is from the DEFAULT_INEQ_SEL. I'm not exactly set on that having to be DEFAULT_INEQ_SEL, but I'm not really sure what we could put that's better. I don't really follow why assuming all rows are required is better. That'll just mean we favour cheap startup plans less, but there might be a case where a cheap startup plan is favourable. I was opting for a happy medium when I thought to use DEFAULT_INEQ_SEL. I also see I might need to do a bit more work on this as the following is not handled correctly: select count(*) over(rows between unbounded preceding and 10 following) from tenk1; it's assuming all rows due to lack of ORDER BY, but it seems like it should be 10 rows due to the 10 FOLLOWING end bound. David