On Sat, Sep 24, 2016 at 1:03 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Sat, Sep 24, 2016 at 1:02 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> Currently, hash indexes always store the hash code in the index, but
>> not the actual Datum.  It's recently been noted that this can make a
>> hash index smaller than the corresponding btree index would be if the
>> column is wide.  However, if the index is being built on a fixed-width
>> column with a typlen <= sizeof(Datum), we could store the original
>> value in the hash index rather than the hash code without using any
>> more space.  That would complicate the code, but I bet it would be
>> faster: we wouldn't need to set xs_recheck, we could rule out hash
>> collisions without visiting the heap, and we could support index-only
>> scans in such cases.
> What exactly you mean by Datum?  Is it for datatypes that fits into 64
> bits like integer.

Yeah, I mean whatever is small enough to fit into the space currently
being used to store the hashcode, along with any accompanying padding
bytes that we can also use.

> I think if we are able to support index only scans
> for hash indexes for some data types, that will be a huge plus.
> Surely, there is some benefit without index only scans as well, which
> is we can avoid recheck, but not sure if that alone can give us any
> big performance boost.  As, you say, it might lead to some
> complication in code, but I think it is worth trying.

Yeah, the recheck is probably not that expensive if we have to
retrieve the heap page anyway.

> Won't it add some requirements for pg_upgrade as well?

I have nothing to add to what Bruce already said.

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to