On Mon, Jan 18, 2016 at 10:57 PM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: >>> If this doesn't regress performance in the case where the number of >>> buckets is estimated accurately to begin with, then I think this is >>> a great idea. Can you supply some performance tests results for that >>> case, and maybe some of the other cases also? >> >> I don't see how it could regress performance, and the benchmarks I've >> done confirm that. I'll do more thorough benchmarking and post the >> results here, but not now as this patch is in 2016-01 CF and I want to >> put all my time into reviewing patches from the open commitfest. > > I've finally got to do more thorough benchmarks, and surprisingly there > seems to be a minor regression. ... > > So even if we entirely skipped the bucket build, we would not recover the > difference. Not sure what's going on here. > > I've also done some profiling using perf, but I haven't really found > anything suspicious there. > > Any ideas what might be the cause of this?
My guess is that memory locality is not as good with this patch. Consider this: copyTuple->next = hashtable->buckets[bucketno]; We've only just populated copytuple via memcpy, so that cache line is presumably in whatever place cache lines go when they are really hot. We basically make one pass over the allocated space for the hash table, filling in the hash tuples and linking things into bucket chains. OTOH, with the patch, we don't do that on the first pass, just writing the tuples. Then when we come back through we still have to do that part: hashTuple->next = hashtable->buckets[bucketno]; By the time we're doing this, especially at larger work_mem settings, we've probably pushed those cache lines out of the faster levels of the CPU cache, and we have to pull them back in again, and that takes time. If the tuples are small, we'll dirty every cache line twice; if they're larger, we might not dirty every cache line on the second pass, but just some fraction of them. I could be totally off base, but this is why I was concerned about the patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers