Hi,

On 2016-10-01 20:19:21 +0200, Tomas Vondra wrote:
> On 10/01/2016 02:44 AM, Andres Freund wrote:
> > Hi,
> >
> > On 2016-07-26 17:43:33 -0700, Andres Freund wrote:
> > > In the attached patch I've included simplehash.h, which can be
> > > customized by a bunch of macros, before being inlined. There's also
> > > a patch using this for tidbitmap.c and nodeAgg/nodeSubplan/... via
> > > execGrouping.c.
> >
> > Attached is a significantly improved version of this series. The main
> > changes are:
> >
> > - The hash table now uses robin-hood style hash collision handling.
> >   See the commit message of 0002 (or simplehash.h) for an
> >   introduction. That significantly reduces the impact of "clustering"
> >   due to linear addressing.
>
> Interesting. Have you looked at cuckoo hashing?

Yes. I don't think it's a good fit for modern CPUs: because it doesn't
probe linearly, you constantly incur random, uncached memory accesses.
I've played with a few schemes, and they all seemed to be slower than
linear-probing variants, unless you intentionally used a bad hash
distribution, or intentionally "unbalanced" the hash space by iterating
over either end of the keyspace and moving the entries around.
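To make the robin-hood scheme concrete, the core of the insertion step
over a linearly probed table looks roughly like the standalone sketch
below. This is illustrative only, not the patch's actual code; it omits
duplicate detection and table growth.

#include <stdint.h>
#include <stdbool.h>

#define TABLE_SIZE 1024                     /* must be a power of two */
#define MASK (TABLE_SIZE - 1)

typedef struct Entry
{
    bool     used;
    uint32_t hash;                          /* cached hash of key */
    uint32_t key;
    uint32_t value;
} Entry;

static Entry table[TABLE_SIZE];

/* how far 'bucket' is from the entry's optimal bucket, wrapping around */
static inline uint32_t
probe_distance(uint32_t hash, uint32_t bucket)
{
    return (bucket - hash) & MASK;
}

/*
 * Robin-hood insertion: probe linearly, but whenever the entry being
 * inserted is farther from its optimal bucket than the entry already
 * occupying the slot, swap them and continue inserting the displaced
 * entry. "Taking from the rich" keeps the variance of probe distances
 * low, which is what tames clustering under linear addressing.
 *
 * Simplifications: assumes 'key' is not already present and that the
 * table never fills up (real code checks duplicates and grows).
 */
static void
rh_insert(uint32_t hash, uint32_t key, uint32_t value)
{
    Entry    cur = { true, hash, key, value };
    uint32_t bucket = hash & MASK;
    uint32_t dist = 0;                      /* cur's current probe distance */

    for (;;)
    {
        Entry *slot = &table[bucket];

        if (!slot->used)
        {
            *slot = cur;                    /* empty slot: done */
            return;
        }

        if (probe_distance(slot->hash, bucket) < dist)
        {
            /* resident is "richer" (closer to home): swap and go on */
            Entry tmp = *slot;
            *slot = cur;
            cur = tmp;
            dist = probe_distance(cur.hash, bucket);
        }

        bucket = (bucket + 1) & MASK;
        dist++;
    }
}

Note that an insert can displace already-placed entries, but only in a
way that narrows the gap between the luckiest and unluckiest entries, so
probe sequences stay short even at high fill factors.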
> > - Significant comment and correctness fixes, both in simplehash.h
> >   itself, and in 0003/0004.
> > - a lot of other random performance improvements for the hashing
> >   code.
> >
> > Unfortunately I'm running out of battery right now, so I don't want
> > to re-run the benchmarks posted upthread (loading takes a while). But
> > the last time I ran them, all the results after the patches were
> > better than before.
>
> Well, I have rather bad experience with running benchmarks on laptops
> anyway - a lot of noise due to power management, etc.

Well, with improvements on the order of 2x, that's not really a big
detractor. And using a newer CPU makes some forms of analysis easier
(better PMUs). For long-running tests I agree, though.

> What about running a bigger benchmark - say, TPC-DS - and evaluating
> the impact?

Worthwhile, although the impact will obviously be a lot smaller, since
those workloads aren't entirely bottlenecked on hash-aggs and bitmap
scans.

> I think a crucial part of the benchmarking will be identifying and
> measuring corner cases, e.g. a lot of rows with the same key, etc.
> Although that probably is not a major issue for the two places
> switched to the new implementation (e.g. execGrouping merges the
> duplicates into a single group, by definition).

Hm. I don't really see a case where that's going to cause issues - all
the execGrouping.c users store only one entry per key, and either
ignore later duplicates or fold their data into the group's initial
tuple. I see no case where repeated keys should be a problem for a hash
table.

> > This patch series currently consists of four patches:
> > - 0001 - add unlikely()/likely() macros. Their use here is not
> >   strictly mandatory, but they provide a quite consistent
> >   improvement. There are other threads where they proved to be
> >   beneficial, so I see little reason not to introduce them.
> > - 0002 - the customizable hashtable itself. It now contains
> >   documentation.
> > - 0003 - convert tidbitmap.c to use the new hashmap. This includes a
> >   fix for the issue mentioned in [1], to improve performance in
> >   heavily lossy scenarios (otherwise we could regress in some extreme
> >   cases).
> > - 0004 - convert execGrouping.c, which is used e.g. by hash
> >   aggregates.
> >
> > While not quite perfect yet, I do think this is at a state where
> > input is needed. I don't want to go a lot deeper into this rabbit
> > hole before we have some agreement that this is a sensible course of
> > action.
>
> So, is it the right time to do some benchmarking?

That's one thing that's required, yes. The other is whether we can live
with the ugliness of implementing templates via macros. I do think we
can, but...
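To give an idea of what the caller side of the macro-template looks
like, instantiating one of these "templates" is roughly the following.
This is an illustrative sketch assuming the macro names from the
patch's simplehash.h; details may differ.

/* in some .c file */
#include "postgres.h"
#include "access/hash.h"            /* for hash_uint32() */

typedef struct WordEntry
{
    uint32  key;                    /* the hash key */
    int     count;                  /* caller-managed payload */
    char    status;                 /* required by simplehash: slot state */
} WordEntry;

/* parametrize the "template" and generate declarations + definitions */
#define SH_PREFIX wordhash
#define SH_ELEMENT_TYPE WordEntry
#define SH_KEY_TYPE uint32
#define SH_KEY key
#define SH_HASH_KEY(tb, k) DatumGetUInt32(hash_uint32(k))
#define SH_EQUAL(tb, a, b) ((a) == (b))
#define SH_SCOPE static inline
#define SH_DECLARE
#define SH_DEFINE
#include "lib/simplehash.h"

/* the include above emitted wordhash_create/insert/lookup/delete/... */
static void
count_word(wordhash_hash *htab, uint32 word)
{
    bool        found;
    WordEntry  *entry = wordhash_insert(htab, word, &found);

    if (!found)
        entry->count = 0;           /* new entry: initialize payload */
    entry->count++;
}

The single #include stamps out a complete implementation specialized
for WordEntry, with the wordhash_ prefix on every emitted type and
function, so multiple instantiations can coexist in one translation
unit.

Greetings,

Andres Freund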