John A Meinel <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> I don't really see why you think that this path is going to lead to >> better performance than where you were before.
> So for an IN (sub-select), does it actually pull all of the rows from > the other table, or is the planner smart enough to stop once it finds > something? It stops when it finds something --- but it's still a join operation in essence. I don't see that putting the values one by one into a table and then joining is going to be a win compared to just processing the values one at a time against the main table. > Is IN (sub-select) about the same as EXISTS (sub-select WHERE x=y)? > What about NOT IN (sub-select) versus NOT EXISTS (sub-select WHERE x=y) The EXISTS variants are actually worse, because we've not spent as much time teaching the planner how to optimize them. There's effectively only one decent plan for an EXISTS, which is that the subselect's "x" is indexed and we do an indexscan probe using the outer "y" for each outer row. IN and NOT IN can do that, or several alternative plans that might be better depending on data statistics. However, that's cold comfort for Matthew's application -- the only way he'd get any benefit from all those planner smarts is if he ANALYZEs the temp table after loading it and then EXECUTEs the main query (so that it gets re-planned every time). Plus, at least some of those alternative plans would require an index on the temp table, which is unlikely to be worth the cost of setting up. And finally, this formulation requires separate IN and NOT IN tests that are necessarily going to do a lot of redundant work. There's enough overhead here that I find it highly doubtful that it'll be a win compared to the original approach of retail queries against the main table. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster