I've noticed that when the stats are wrong (like in cases where you've loaded data but reltuples hasn't been updated yet) that a hashed NOT IN seems to take a significant time penalty. Is this to be expected?
On a pktest table with 1 million integers and a dual table with a single integer and sort_mem set high enough to give a hashed subplan for the various reltuples values, I saw the following behavior for explain analyze select * from dual where a not in (select a from pktest); with reltuples=1000 for pktest, query takes about 96 seconds reltuples=10000, query takes about 15 seconds reltuples=100000, query takes about 8 seconds And the memory usage seemed to be the same even if I set sort_mem back to 1024. ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match