Hi On Tue, Jan 1, 2019 at 10:08 PM Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> Hi, > > I've been looking at this patch today, and I think there's a couple of > issues with the costing and execution. Consider a simple table with 1M > rows: > > create table t as select i from generate_series(1,1000000) s(i); > > and these two queries, that both select 1% of rows > > select * from t fetch first 10000 rows only; > > select * from t fetch first 1 percent rows only; > > which are costed like this > > test=# explain select * from t fetch first 10000 rows only; > QUERY PLAN > ----------------------------------------------------------------- > Limit (cost=0.00..144.25 rows=10000 width=4) > -> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=4) > (2 rows) > > test=# explain select * from t fetch first 1 percent rows only; > QUERY PLAN > ----------------------------------------------------------------- > Limit (cost=0.00..14425.00 rows=1000000 width=4) > -> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=4) > (2 rows) > > There are two issues with the costs in the "percent" case. > > Firstly, the estimated number of rows should be about the same (10k) in > both cases, but the "percent" case incorrectly uses the total row count > (i.e. as if 100% rows were returned). > > Ok I will correct it It's correct that the total cost for the "percent" case is based on 100% > of rows, of course, because the code has to fetch everything and stash > it into the tuplestore in LIMIT_INITIAL phase. > > But that implies the startup cost for the "percent" case can't be 0, > because all this work has to be done before emitting the first row. > > Correct, startup cost must be equal to total cost of source data path and total cost have to be slightly greater than startup cost . I am planing to increase the total cost by limitCount * 0.1(similar to the parallel_tuple_cost) because getting tuple from tuplestore are almost similar operation to passing a tuple from worker to master backend. > So these costing aspects need fixing, IMHO. > > > The execution part of the patch seems to be working correctly, but I > think there's an improvement - we don't need to execute the outer plan > to completion before emitting the first row. For example, let's say the > outer plan produces 10000 rows in total and we're supposed to return the > first 1% of those rows. We can emit the first row after fetching the > first 100 rows, we don't have to wait for fetching all 10k rows. > > but total rows count is not given how can we determine safe to return row Regards Surafel