On Sun, 17 Aug 2003, Tom Lane wrote:

> Stephan Szabo <[EMAIL PROTECTED]> writes:
> > Basically, the first thing I noticed was that changing reltuples
> > on the pg_class row for a table affected the speed of
> > explain analyze select * from othertable where foo not in (select bar from
> > table);
> > even when the plan wasn't changing, seqscan + filter on hashed subquery.
>
> That doesn't make any sense to me --- AFAICS, only the planner pays any
> attention to reltuples, so it could only affect things via changing the
> plan.  Could we see details?

I've included a perl file that generates data like that I was using and
the output of the commands from that through psql -E on my machine.  The
times seem pretty repeatable in any order so caching and such doesn't seem
to be playing a big part.

> > Then I noted that changing sort_mem changed the point at which it would
> > choose a hashed subquery in the initial plan based on the estimated
> > tuples, but didn't seem to actually affect the real memory usage,
>
> Yeah, the hashed=subquery code doesn't make any attempt to spill to
> disk.  So if the planner's estimate is badly off, you could see actual
> usage well in excess of sort_mem.

Ah, that makes sense then.

Attachment: hashtest.pl
Description: Perl program

CREATE TABLE pktest(a int unique);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "pktest_a_key" for table 
"pktest"
CREATE TABLE
CREATE TABLE fktest(b int);
CREATE TABLE
COPY pktest FROM STDIN;
COPY fktest FROM STDIN;
set sort_mem=50000;
SET
update pg_class set reltuples=1000000 where relname='pktest';
UPDATE 1
explain select * from fktest where b not in (select a from pktest);
                               QUERY PLAN                               
------------------------------------------------------------------------
 Seq Scan on fktest  (cost=12510.00..12532.50 rows=500 width=4)
   Filter: (NOT (hashed subplan))
   SubPlan
     ->  Seq Scan on pktest  (cost=0.00..10010.00 rows=1000000 width=4)
(4 rows)

explain analyze select * from fktest where b not in (select a from pktest);
                                                       QUERY PLAN                      
                                  
-------------------------------------------------------------------------------------------------------------------------
 Seq Scan on fktest  (cost=12510.00..12532.50 rows=500 width=4) (actual 
time=6911.34..6911.34 rows=0 loops=1)
   Filter: (NOT (hashed subplan))
   SubPlan
     ->  Seq Scan on pktest  (cost=0.00..10010.00 rows=1000000 width=4) (actual 
time=0.20..3109.57 rows=1000000 loops=1)
 Total runtime: 6927.20 msec
(5 rows)

update pg_class set reltuples=100000 where relname='pktest';
UPDATE 1
explain select * from fktest where b not in (select a from pktest);
                              QUERY PLAN                              
----------------------------------------------------------------------
 Seq Scan on fktest  (cost=1260.00..1282.50 rows=500 width=4)
   Filter: (NOT (hashed subplan))
   SubPlan
     ->  Seq Scan on pktest  (cost=0.00..1010.00 rows=100000 width=4)
(4 rows)

explain analyze select * from fktest where b not in (select a from pktest);
                                                      QUERY PLAN                       
                                
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on fktest  (cost=1260.00..1282.50 rows=500 width=4) (actual 
time=7036.99..7036.99 rows=0 loops=1)
   Filter: (NOT (hashed subplan))
   SubPlan
     ->  Seq Scan on pktest  (cost=0.00..1010.00 rows=100000 width=4) (actual 
time=0.38..2574.47 rows=1000000 loops=1)
 Total runtime: 7052.24 msec
(5 rows)

update pg_class set reltuples=10000 where relname='pktest';
UPDATE 1
explain select * from fktest where b not in (select a from pktest);
                             QUERY PLAN                             
--------------------------------------------------------------------
 Seq Scan on fktest  (cost=135.00..157.50 rows=500 width=4)
   Filter: (NOT (hashed subplan))
   SubPlan
     ->  Seq Scan on pktest  (cost=0.00..110.00 rows=10000 width=4)
(4 rows)

explain analyze select * from fktest where b not in (select a from pktest);
                                                     QUERY PLAN                        
                              
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on fktest  (cost=135.00..157.50 rows=500 width=4) (actual 
time=14793.11..14793.11 rows=0 loops=1)
   Filter: (NOT (hashed subplan))
   SubPlan
     ->  Seq Scan on pktest  (cost=0.00..110.00 rows=10000 width=4) (actual 
time=0.10..2635.54 rows=1000000 loops=1)
 Total runtime: 14808.24 msec
(5 rows)

update pg_class set reltuples=1000 where relname='pktest';
UPDATE 1
explain select * from fktest where b not in (select a from pktest);
                            QUERY PLAN                            
------------------------------------------------------------------
 Seq Scan on fktest  (cost=22.50..45.00 rows=500 width=4)
   Filter: (NOT (hashed subplan))
   SubPlan
     ->  Seq Scan on pktest  (cost=0.00..20.00 rows=1000 width=4)
(4 rows)

explain analyze select * from fktest where b not in (select a from pktest);
                                                    QUERY PLAN                         
                            
-------------------------------------------------------------------------------------------------------------------
 Seq Scan on fktest  (cost=22.50..45.00 rows=500 width=4) (actual 
time=94120.87..94120.87 rows=0 loops=1)
   Filter: (NOT (hashed subplan))
   SubPlan
     ->  Seq Scan on pktest  (cost=0.00..20.00 rows=1000 width=4) (actual 
time=0.10..2949.57 rows=1000000 loops=1)
 Total runtime: 94135.92 msec
(5 rows)

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Reply via email to