Hello -- I'm getting some unexpected results with a NOT IN query -- this is on 8.2.5.
This is the query in question: prod_2=> select id from feed_download_task where id in (111102466,141701504) and id not in (select last_feed_download_task_id from subscription); id ---- (0 rows) This query returns zero rows, but I expect it to return 1 row, because I know that 111102466 *is not* in (select last_feed_download_task_id from subscription) and I know that 1471701504 *is* in that set, as demonstrated below: Verify that both id values are in the feed_download_task table: prod_2=> select id from feed_download_task where id in (111102466,141701504); id ----------- 141701504 111102466 (2 rows) Verify that 111102466 is NOT in the set of last_feed_download_task_id's, and that 141701504 is in this set: prod_2=> select last_feed_download_task_id from subscription where last_feed_download_task_id in (111102466,141701504); last_feed_download_task_id ---------------------------- 141701504 (1 row) Here's the problem query again, with explain analyze. prod_2=> select id from feed_download_task where id in (111102466,141701504) and id not in (select last_feed_download_task_id from subscription); id ---- (0 rows) prod_2=> explain analyze select id from feed_download_task where id in (111102466,141701504) and id not in (select last_feed_download_task_id from subscription); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on feed_download_task (cost=45077.24..45083.27 rows=1 width=4) (actual time=601.229..601.229 rows=0 loops=1) Recheck Cond: (id = ANY ('{111102466,141701504}'::integer[])) Filter: (NOT (hashed subplan)) -> Bitmap Index Scan on feed_download_task_pkey (cost=0.00..30.52rows=2 width=0) (actual time= 0.095..0.095 rows=2 loops=1) Index Cond: (id = ANY ('{111102466,141701504}'::integer[])) SubPlan -> Seq Scan on subscription (cost=0.00..44097.78 rows=379578 width=4) (actual time=0.032..488.193 rows=162365 loops=1) Total runtime: 601.281 ms (8 rows) I've tried re-analyzing and re-indexing the tables involved in this query, but I still left scratching my head. I am also aware that I can use a left join instead of a NOT IN query -- but in this case I need to use this in a DELETE statement, which eliminates the possibility of the left join (I think). Here is a version using a left outer join, it returns the expected result: prod_2=# select feed_download_task.id from feed_download_task left join subscription on (subscription.last_feed_download_task_id = feed_download_task.id) where feed_download_task.id in (111102466,141701504) and subscription.id IS NOT NULL; id ----------- 141701504 (1 row) Here is the explain analyze output for the above query: prod_2=# explain analyze select feed_download_task.id from feed_download_task left join subscription on ( subscription.last_feed_download_task_id = feed_download_task.id) where feed_download_task.id in (111102466,141701504) and subscription.id IS NOT NULL; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=31.19..51.69 rows=1 width=4) (actual time=0.158..0.210rows=1 loops=1) -> Bitmap Heap Scan on feed_download_task (cost=31.19..37.21 rows=2 width=4) (actual time=0.120..0.134 rows=2 loops=1) Recheck Cond: (id = ANY ('{111102466,141701504}'::integer[])) -> Bitmap Index Scan on feed_download_task_pkey (cost=0.00..31.19rows=2 width=0) (actual time= 0.102..0.102 rows=2 loops=1) Index Cond: (id = ANY ('{111102466,141701504}'::integer[])) -> Index Scan using index_subscription_on_last_feed_download_task_id on subscription (cost=0.00..7.23 rows=1 width=4) (actual time=0.036..0.037rows=0 loops=2) Index Cond: (subscription.last_feed_download_task_id = feed_download_task.id) Filter: (id IS NOT NULL) I feel like I must be missing something obvious. Thanks in advance for the assistance. cheers, Mason