[HACKERS] NOT IN Doesn't use Anti Joins?

2009-12-17 Thread Rod Taylor
I'm sure there is a good reason why NOT IN will not use an Anti-Join
plan equivalent to NOT EXISTS due to NULL handling, but in this
particular case the value being compared is in the PRIMARY KEY of both
structures being joined.

The NOT IN plan was killed after 10 minutes. The NOT EXISTS plan
returned data in roughly 10ms.

Is there a reason why the NOT IN plan could not use Anti-Joins when
the column being compared against is guaranteed to be NOT NULL? Too
much planner overhead to determine nullness of the column?


sk=# explain select * from source_reb_listing where listing_id not in
(select listing_id from source_reb_listing_specs) order by file_id
desc limit 5;

 QUERY
PLAN
-
 Limit  (cost=729015.39..3420463.83 rows=5 width=28)
   -  Index Scan Backward using source_reb_listing_fileid_idx on
source_reb_listing  (cost=729015.39..169537219655.96 rows=314954
width=28)
 Filter: (NOT (SubPlan 1))
 SubPlan 1
   -  Materialize  (cost=729015.39..1185280.74 rows=32810035 width=8)
 -  Seq Scan on source_reb_listing_specs
(cost=0.00..568040.35 rows=32810035 width=8)
(6 rows)

sk=# explain select * from source_reb_listing where not exists (select
* from source_reb_listing_specs as t where t.listing_id =
source_reb_listing.listing_id) order by file_id desc limit 5;
   QUERY
PLAN
-
 Limit  (cost=0.00..35.31 rows=5 width=28)
   -  Nested Loop Anti Join  (cost=0.00..3880495.87 rows=549496 width=28)
 -  Index Scan Backward using source_reb_listing_fileid_idx
on source_reb_listing  (cost=0.00..1107142.20 rows=629907 width=28)
 -  Index Scan using source_reb_listing_specs_pkey on
source_reb_listing_specs t  (cost=0.00..1592.74 rows=408 width=8)
   Index Cond: (t.listing_id = source_reb_listing.listing_id)
(5 rows)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] NOT IN Doesn't use Anti Joins?

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 9:02 AM, Rod Taylor rod.tay...@gmail.com wrote:
 Is there a reason why the NOT IN plan could not use Anti-Joins when
 the column being compared against is guaranteed to be NOT NULL? Too
 much planner overhead to determine nullness of the column?

I doubt it.  I think it's just a question of round tuits.  I think Tom
hasn't felt it worth the effort since you can work around it by
rewriting the query, and nobody else has bothered to work up a patch.
If you feel like working on it, I think it would be a nice
improvement.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers