Hi Thomas, Tomas :), Alvaro, hackers,

>> > After removing HASH_FFACTOR PostgreSQL still compiles... Would
>> > removing it break some external API/extensions ?
>>
>> FWIW, HASH_FFACTOR has *never* been used in Postgres core code.
[..]
>> I think if we know that there's a 4% performance increase by removing
>> the division that comes with an ability to use a custom fillfactor that
>> nobody uses or has ever used, it makes sense to remove that ability.
Thomas wrote:
> I think Tomas is right, and the correct fix for this would be to
> compute it up front every time the hash table size changes, but since
> apparently no one ever used it, +1 for just removing it and leaving it
> hard-coded.
[..]
> For what it's worth, for dshash.c I just hard coded it to double the
> hash table size whenever the number of elements in a partition
> exceeded 75%. Popular hash tables in standard libraries seem to use
> either .75 or 1.0 as a default or only setting. There's probably room
> for discussion about numbers in those ranges, (..)

Tomas also asked the same question -- "what if you lower the fill factor, e.g. to 0.8? Does that improve performance or not?" -- and I got some interesting links on avoiding benchmarking bias. I couldn't find any simple way to measure the real impact of it other than DB crash-recovery (if anyone has an artificial workload generator that stresses PinBuffer->hash_search(), please let me know). So I've been testing it using WAL crash-recovery, following Thomas's approach [1]: always the same workload to replay, always on the same machine, same env: same 13GB WAL to recover, 8GB db, 24GB shared_buffers on top of 128GB RAM, NVMe with fsync off to make dynahash the primary bottleneck (BufTableLookup() + smgropen()), append-only workload, gcc -O3.

Results below: the 1st time is the WAL recovery timing, the 2nd time is the checkpoint before opening the DB. There were four measurements per scenario; the first one is cut off to avoid noise.

1. DEF_FFACTOR=1 idiv on
103.971, 3.719
103.937, 3.683
104.934, 3.691

2. DEF_FFACTOR=1 idiv off
100.565, 3.71
101.595, 3.807
100.794, 3.691

3. DEF_FFACTOR=1.2 idiv on // some proof that nobody uses it
1599552729.041 postmaster 94510 FATAL: failed to initialize hash table "SERIALIZABLEXID hash"

4. DEF_FFACTOR=0.8 idiv on // another proof?
The reason for the crash below is that ffactor is declared as long and so cannot hold a fractional value: 0.8 silently truncates to 0, and the integer division in init_htab() then traps. If it were a float...

Program terminated with signal 8, Arithmetic exception.
#0  0x00000000009a4119 in init_htab (nelem=4, hashp=0x17cd238) at dynahash.c:677
677             nbuckets = next_pow2_int((nelem - 1) / hctl->ffactor + 1);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.180.amzn1.x86_64
#0  0x00000000009a4119 in init_htab (nelem=4, hashp=0x17cd238) at dynahash.c:677
#1  hash_create (tabname=tabname@entry=0xb77f9a "Timezones", nelem=nelem@entry=4, info=info@entry=0x7ffd3055cf30, flags=flags@entry=16) at dynahash.c:529
#2  0x00000000009ecddf in init_timezone_hashtable () at pgtz.c:211
#3  pg_tzset (name=name@entry=0xb49fd5 "GMT") at pgtz.c:248
#4  0x00000000009ecfee in pg_timezone_initialize () at pgtz.c:372

5. DEF_FFACTOR=1 idiv on // reproducer, just in case, to validate measurement #1
104.333, 3.804
104.313, 3.772
103.951, 3.687

6. DEF_FFACTOR=2 idiv on
105.406, 3.783
105.33, 3.721
105.403, 3.777

7. DEF_FFACTOR=4 idiv on
105.803, 3.722
105.73, 3.728
106.104, 3.756

Some notes:
- SMgrRelationHash is initialized at 400 entries from the start, while the DB here is a small 8GB one with only a couple of tables, so no expansion was needed in the above test.
- PrivateRefCountHash, the backend-private overflow hash table for buffer pins, is initialized at 100 entries and may grow during the run.
- Googling for DEF_FFACTOR, I found only one occurrence with a value <> 1, from 1993 (see [2]; what's interesting is that DEF_SEGSIZE still has the same value after all those years).

Alvaro wrote:
> ... however, if we're really tense about improving some hash table's
> performance, it might be worth trying to use simplehash.h instead of
> dynahash.c for it.

Thomas wrote:
> Sure, but removing the dynahash IDIV also seems like a slam dunk
> removal of a useless expensive feature.

I agree with both; I just thought this might be an interesting finding, as this idiv might (?) be present in other common paths like ReadBuffer*() / PinBuffer() (per some recent discussions, maybe on NUMA boxes), not just WAL recovery, and it seems relatively easy to improve.

-J.
[1] - https://github.com/macdice/redo-bench
[2] - https://fuhrwerks.com/csrg/info/93c40a660b6cdf74

________________________________
From: Thomas Munro <thomas.mu...@gmail.com>
Sent: Tuesday, September 8, 2020 2:55 AM
To: Alvaro Herrera <alvhe...@2ndquadrant.com>
Cc: Jakub Wartak <jakub.war...@tomtom.com>; pgsql-hackers <pgsql-hack...@postgresql.org>
Subject: Re: Division in dynahash.c due to HASH_FFACTOR

On Sat, Sep 5, 2020 at 2:34 AM Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> On 2020-Sep-04, Jakub Wartak wrote:
> > After removing HASH_FFACTOR PostgreSQL still compiles... Would
> > removing it break some external API/extensions ?
>
> FWIW, HASH_FFACTOR has *never* been used in Postgres core code.
>
> https://postgr.es/m/20160418180711.55ac82c0@fujitsu already reported
> that this flag was unused a few years ago. And a search through
> codesearch.debian.net shows that no software packaged by Debian uses
> that flag either.
>
> *If* any third-party extension is using HASH_FFACTOR, it can easily be
> unbroken by removing the flag or #defining it to zero; the removal would
> only be a problem if anyone showed that there is a performance loss by
> using the default fillfactor for some dynahash table instead of their
> custom value.
>
> I think if we know that there's a 4% performance increase by removing
> the division that comes with an ability to use a custom fillfactor that
> nobody uses or has ever used, it makes sense to remove that ability.

I think Tomas is right, and the correct fix for this would be to
compute it up front every time the hash table size changes, but since
apparently no one ever used it, +1 for just removing it and leaving it
hard-coded.

For what it's worth, for dshash.c I just hard coded it to double the
hash table size whenever the number of elements in a partition
exceeded 75%. Popular hash tables in standard libraries seem to use
either .75 or 1.0 as a default or only setting.
There's probably room for discussion about numbers in those ranges,
but I think the concept of customisable load factors with a range much
wider than that may be more relevant for disk-based hash tables (like
our hash indexes), which have very different economics.

I think the name HASH_FFACTOR may be a clue that this was possibly
interference from disk-based hash tables from the same Berkeley
people: rather than "load factor", the more usual term (?) for
nentries/nbuckets in memory-based hash tables as a way to model the
number of expected collisions, they called it "fill factor". I guess
that's because in disk-based designs you actually want a certain rate
of collisions: you're working not with elements in an array, trying to
avoid collisions without wasting space, but with fixed-size buckets
that you want to fill up, but not overflow.
</archeological-speculation-mode>

> ... however, if we're really tense about improving some hash table's
> performance, it might be worth trying to use simplehash.h instead of
> dynahash.c for it.

Sure, but removing the dynahash IDIV also seems like a slam dunk
removal of a useless expensive feature.