John Arbash Meinel <[EMAIL PROTECTED]> writes: > So the big issue is why does the planner think that a nested loop is > going to be more expensive than a merge join. That I don't really know.
Well, with the increased (and much more accurate) rowcount estimate, the estimated cost of the nestloop naturally went up a lot: it's proportional to the number of rows involved. It appears that the estimated cost of the mergejoin actually went *down* quite a bit (else it'd have been selected the first time too). That seems odd to me. AFAIR the only reason that would happen is that given stats about the distributions of the two join keys, the planner can recognize that one side of the merge may not need to be run to completion --- for example if one column ranges from 1..100 and the other only from 1..40, you never need to look at the values 41..100 in the first table. You can see in the explain output that this is indeed happening to some extent: -> Sort (cost=3200.13..3267.24 rows=26844 width=4) (actual time=352.324..453.352 rows=24746 loops=1) Sort Key: tool.id -> Seq Scan on tool (cost=0.00..1225.44 rows=26844 width=4) (actual time=0.024..126.826 rows=26844 loops=1) Only 24746 of the 26844 tool rows ever got read from the sort node (and even that is probably overstating matters; if there are duplicate toolid values in the lefthand input, as seems likely, then the same rows will be pulled from the sort node multiple times). However, when both sides of the merge are being explicitly sorted, as is happening here, then not running one side to completion does not save you much at all (since you had to do the sort anyway). The early-out trick only really wins when you can quit early on a more incremental subplan, such as an indexscan. So I'm pretty surprised that the planner made this pair of choices. The estimated cost of the mergejoin shouldn't have changed much with the addition of statistics, and so ISTM it should have been picked the first time too. Walt, is there anything proprietary about the contents of these tables? If you'd be willing to send me a dump off-list, I'd like to dig through what the planner is doing here. There may be a bug somewhere in the cost estimation code. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]