On Fri, Mar 3, 2017 at 8:41 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Fri, Mar 3, 2017 at 1:22 AM, Andres Freund <and...@anarazel.de> wrote: >> the resulting hash-values aren't actually meaningfully influenced by the >> IV. Because we just xor with the IV, most hash-value that without the IV >> would have fallen into a single hash-bucket, fall into a single >> hash-bucket afterwards as well; just somewhere else in the hash-range. > > Wow, OK. I had kind of assumed (without looking) that setting the > hash IV did something a little more useful than that. Maybe we should > do something like struct blah { int iv; int hv; }; newhv = > hash_any(&blah, sizeof(blah)). > Sounds good. I've seen a post from Thomas Munro suggesting some alternative approach for combining hash values in execGrouping.c[1].
>> In addition to that it seems quite worthwhile to provide an iterator >> that's not vulnerable to this. An approach that I am, seemingly >> successfully, testing is to iterate the hashtable in multiple (in my >> case 23, because why not) passes, accessing only every nth element. That >> allows the data to be inserted in a lot less "dense" fashion. But >> that's more an optimization, so I'll just push something like the patch >> mentioned in the thread already. >> >> Makes some sense? > > Yep. > Yes, it makes sense. Quadratic probing is another approach, but it would require an extra shift op every time we want to find the next or prev location during a collision. [1] https://www.postgresql.org/message-id/caeepm%3d3rdgjfxw4ckvj0oemya2-34b0qhng1xv0vk7tgpjg...@mail.gmail.com -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers