On Sun, 17 Aug 2003, Tom Lane wrote: > Stephan Szabo <[EMAIL PROTECTED]> writes: > > Basically, the first thing I noticed was that changing reltuples > > on the pg_class row for a table affected the speed of > > explain analyze select * from othertable where foo not in (select bar from > > table); > > even when the plan wasn't changing, seqscan + filter on hashed subquery. > > That doesn't make any sense to me --- AFAICS, only the planner pays any > attention to reltuples, so it could only affect things via changing the > plan. Could we see details?
I've included a perl file that generates data like that I was using and the output of the commands from that through psql -E on my machine. The times seem pretty repeatable in any order so caching and such doesn't seem to be playing a big part. > > Then I noted that changing sort_mem changed the point at which it would > > choose a hashed subquery in the initial plan based on the estimated > > tuples, but didn't seem to actually affect the real memory usage, > > Yeah, the hashed=subquery code doesn't make any attempt to spill to > disk. So if the planner's estimate is badly off, you could see actual > usage well in excess of sort_mem. Ah, that makes sense then.
hashtest.pl
Description: Perl program
CREATE TABLE pktest(a int unique); NOTICE: CREATE TABLE / UNIQUE will create implicit index "pktest_a_key" for table "pktest" CREATE TABLE CREATE TABLE fktest(b int); CREATE TABLE COPY pktest FROM STDIN; COPY fktest FROM STDIN; set sort_mem=50000; SET update pg_class set reltuples=1000000 where relname='pktest'; UPDATE 1 explain select * from fktest where b not in (select a from pktest); QUERY PLAN ------------------------------------------------------------------------ Seq Scan on fktest (cost=12510.00..12532.50 rows=500 width=4) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on pktest (cost=0.00..10010.00 rows=1000000 width=4) (4 rows) explain analyze select * from fktest where b not in (select a from pktest); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- Seq Scan on fktest (cost=12510.00..12532.50 rows=500 width=4) (actual time=6911.34..6911.34 rows=0 loops=1) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on pktest (cost=0.00..10010.00 rows=1000000 width=4) (actual time=0.20..3109.57 rows=1000000 loops=1) Total runtime: 6927.20 msec (5 rows) update pg_class set reltuples=100000 where relname='pktest'; UPDATE 1 explain select * from fktest where b not in (select a from pktest); QUERY PLAN ---------------------------------------------------------------------- Seq Scan on fktest (cost=1260.00..1282.50 rows=500 width=4) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on pktest (cost=0.00..1010.00 rows=100000 width=4) (4 rows) explain analyze select * from fktest where b not in (select a from pktest); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Seq Scan on fktest (cost=1260.00..1282.50 rows=500 width=4) (actual time=7036.99..7036.99 rows=0 loops=1) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on pktest (cost=0.00..1010.00 rows=100000 width=4) (actual time=0.38..2574.47 rows=1000000 loops=1) Total runtime: 7052.24 msec (5 rows) update pg_class set reltuples=10000 where relname='pktest'; UPDATE 1 explain select * from fktest where b not in (select a from pktest); QUERY PLAN -------------------------------------------------------------------- Seq Scan on fktest (cost=135.00..157.50 rows=500 width=4) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on pktest (cost=0.00..110.00 rows=10000 width=4) (4 rows) explain analyze select * from fktest where b not in (select a from pktest); QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Seq Scan on fktest (cost=135.00..157.50 rows=500 width=4) (actual time=14793.11..14793.11 rows=0 loops=1) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on pktest (cost=0.00..110.00 rows=10000 width=4) (actual time=0.10..2635.54 rows=1000000 loops=1) Total runtime: 14808.24 msec (5 rows) update pg_class set reltuples=1000 where relname='pktest'; UPDATE 1 explain select * from fktest where b not in (select a from pktest); QUERY PLAN ------------------------------------------------------------------ Seq Scan on fktest (cost=22.50..45.00 rows=500 width=4) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on pktest (cost=0.00..20.00 rows=1000 width=4) (4 rows) explain analyze select * from fktest where b not in (select a from pktest); QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Seq Scan on fktest (cost=22.50..45.00 rows=500 width=4) (actual time=94120.87..94120.87 rows=0 loops=1) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on pktest (cost=0.00..20.00 rows=1000 width=4) (actual time=0.10..2949.57 rows=1000000 loops=1) Total runtime: 94135.92 msec (5 rows)
---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster