On Thu, Jul 15, 2004 at 12:52:38AM -0400, Tom Lane wrote:
> No, it's not missing anything. The number being reported here is the
> number of rows pulled from the plan node --- but this plan node is on
> the inside of a merge join, and one of the properties of merge join is
> that it will do partial rescans of its inner input in the presence of
> equal keys in the outer input. If you have, say, 10 occurrences of
> "42" in the outer input, then any "42" rows in the inner input have to
> be rescanned 10 times. EXPLAIN ANALYZE will count each of them as 10
> rows returned by the input node.
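(Indeed, this is easy to see on a toy case; the table names below are
made up, and the exact plan may vary:

    CREATE TABLE a (k int);
    CREATE TABLE b (k int);
    INSERT INTO a VALUES (42);   -- repeat until a holds ten copies of 42
    INSERT INTO b VALUES (42);   -- single matching row on the other side
    SET enable_hashjoin TO off;  -- leave merge join as the only choice
    SET enable_nestloop TO off;
    EXPLAIN ANALYZE SELECT * FROM a, b WHERE a.k = b.k;

If b ends up on the inner side of the merge join, the node feeding it
should report something like rows=10 -- not rows=1 -- with loops=1,
since its single row is rescanned once per duplicate 42 on the outer
side.)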
OK, that makes sense, although it seems to me as if loops= should have
been something larger than 1 if the data was scanned multiple times.

> The large multiple here (80-to-one overscan) says that you've got a lot
> of duplicate values in the outer input. This is generally a good
> situation to *not* use a mergejoin in ;-). We do have some logic in the
> planner that attempts to estimate the extra cost involved in such
> rescanning, but I'm not sure how accurate the cost model is.

Hmm, I'm not sure I have the terminology right here -- the "outer input"
in "A left join B" is A, right? But yes, I do have a lot of duplicates;
that seems to match my data well.

> Raising shared_buffers seems unlikely to help. I do agree with raising
> sort_mem --- not so much to make the merge faster as to encourage the
> thing to try a hash join instead.

sort_mem is already 16384, which I thought would be plenty -- I tried
increasing it to 65536, which made exactly zero difference. :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/
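PS: In case it is useful for the archives: an easy way to check whether
a hash join would actually win here is to turn the merge join off for
one session and re-run the query. A sketch only; the SELECT below is a
placeholder for the real query:

    SET enable_mergejoin TO off;  -- this session only, not for production
    SET sort_mem = 65536;         -- in kB; gives the hash table some room
    EXPLAIN ANALYZE SELECT ... ;  -- the problem query goes here

If the hash join plan comes out faster, that would point at the merge
join rescan costing rather than at the data itself.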