On Fri, Aug 5, 2016 at 10:39 AM, Kenneth Marshall <k...@rice.edu> wrote: > I have been following the recent discussions on increasing the > size of the hash function used in Postgres and the work to > provide WAL and performance improvements for hash indexes. > I know it was mentioned when we moved to the new hashing > functions, but the existing functions do provide an additional > 32-bits of hash. We currently do not use them, but they are > already calculated. > > It had occurred to me that one way to decrease the space used > to store the hash value would be to include information about > the page number to determine the actual value. For example, > a hash index of 65k pages (540mb) would get two additional > bytes of hash with no associated storage cost. Also, if you > subdivided the hash page into say 128 sub-pages you would > get the extra 2 bytes of hash in a 4mb index. In this way, > the bigger the hash index is, the more bits you get for free. > > Just wanted to throw it out there.
I'm not sure I understand exactly what you are proposing here. Suppose we have 64 bits of hashcode and (1 << N) buckets. Currently, we store hashcode bits 0..31 on each item. Maybe what you're saying is that we could instead store bits B..(31+B) on each item - that is, cram in as many extra bits on each individual item as log2(nbuckets). The problem with that is that it would make it very hard to split buckets. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers