Greg Stark <[EMAIL PROTECTED]> writes:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>> This would mean that we wouldn't be assuming that tuples near the end
>> take as long as tuples near the beginning. Except we're now dealing
>> with smaller numbers, so I'm worried about error accumulation.
> Hm, that would explain why Hash joins suffer from this especially. Even when
> functioning properly hashes get slower as the buckets fill up and there are
> longer lists to traverse.

Nope, that is certainly not the explanation, because the hash table is
loaded in the (single) call of the Hash node at the start of the query.
It is static all through the sampled-and-not executions of the Hash Join
node, which is where our problem is.

I don't see that Martijn's idea responds to the problem anyway, if it is
some sort of TLB-related issue. The assumption we are making is not
"tuples near the end take as long as tuples near the beginning", it is
"tuples we sample take as long as tuples we don't" (both statements of
course meaning "on the average"). If the act of sampling incurs overhead
beyond the gettimeofday() call itself, then we are screwed, and playing
around with which iterations we sample and how we do the extrapolation
won't make the slightest bit of difference.

I'm unsure about the TLB-flush theory because I see no evidence of any
such overhead in the 8.1 timings; but on the other hand it's hard to see
what else could explain the apparent dependence on targetlist width.

			regards, tom lane