On Sat, Sep 24, 2016 at 1:03 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Sat, Sep 24, 2016 at 1:02 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> Currently, hash indexes always store the hash code in the index, but
>> not the actual Datum. It's recently been noted that this can make a
>> hash index smaller than the corresponding btree index would be if the
>> column is wide. However, if the index is being built on a fixed-width
>> column with a typlen <= sizeof(Datum), we could store the original
>> value in the hash index rather than the hash code without using any
>> more space. That would complicate the code, but I bet it would be
>> faster: we wouldn't need to set xs_recheck, we could rule out hash
>> collisions without visiting the heap, and we could support index-only
>> scans in such cases.
> What exactly do you mean by Datum? Is it for data types that fit into 64
> bits, like integer?
Yeah, I mean whatever is small enough to fit into the space currently
being used to store the hashcode, along with any accompanying padding
bytes that we can also use.
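
To make the size condition concrete, here is a minimal sketch in plain C.
It is not PostgreSQL code: SlotDatum, key_fits_in_slot, pack_key and
unpack_key are hypothetical names, and the slot is assumed to be 8 bytes
(the hash code plus its padding). The only convention borrowed from the
catalog is that variable-length types report a negative typlen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the space the hash code occupies; assumed to be 8 bytes. */
typedef uint64_t SlotDatum;

/*
 * Hypothetical check: can a key of this type be stored in place of its
 * hash code? Variable-length types have a negative typlen, so they never
 * qualify.
 */
static bool
key_fits_in_slot(int typlen)
{
    return typlen > 0 && (size_t) typlen <= sizeof(SlotDatum);
}

/* Hypothetical helper: copy a fixed-width key into a zeroed slot. */
static SlotDatum
pack_key(const void *key, int typlen)
{
    SlotDatum slot = 0;

    assert(key_fits_in_slot(typlen));
    memcpy(&slot, key, (size_t) typlen);
    return slot;
}

/* Hypothetical helper: copy the key back out of the slot. */
static void
unpack_key(SlotDatum slot, void *key, int typlen)
{
    assert(key_fits_in_slot(typlen));
    memcpy(key, &slot, (size_t) typlen);
}
```

Under these assumptions a 4-byte integer or an 8-byte bigint would
qualify, while text or a 16-byte type would fall back to storing the
hash code as today.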
> I think if we are able to support index-only scans
> for hash indexes for some data types, that will be a huge plus.
> Surely, there is some benefit even without index-only scans, namely
> that we can avoid the recheck, but I'm not sure that alone can give us
> any big performance boost. As you say, it might lead to some
> complication in the code, but I think it is worth trying.
Yeah, the recheck is probably not that expensive if we have to
retrieve the heap page anyway.
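
For anyone following along, a toy sketch of why the recheck exists in the
first place. Nothing here is the real access method code: toy_hash is a
deliberately collision-prone stand-in for the actual hash function, and
both match functions are hypothetical. The recheck output flag only
mimics the role xs_recheck plays for the executor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy hash, deliberately collision-prone; not PostgreSQL's hash function. */
static uint32_t
toy_hash(int32_t key)
{
    return (uint32_t) key % 16u;
}

/*
 * Entry that stores only the hash code. Equal hash codes make the tuple a
 * candidate, so the caller must still compare the real value in the heap.
 */
static bool
hash_entry_matches(uint32_t stored_hash, int32_t search_key, bool *recheck)
{
    *recheck = true;
    return stored_hash == toy_hash(search_key);
}

/*
 * Entry that stores the original value. The comparison is exact, so no
 * recheck against the heap is needed.
 */
static bool
value_entry_matches(int32_t stored_key, int32_t search_key, bool *recheck)
{
    *recheck = false;
    return stored_key == search_key;
}
```

With this toy hash, keys 1 and 17 collide: a search for 1 reports the
entry for 17 as a candidate when only hash codes are stored, and rejects
it outright when the value itself is stored.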
> Won't it add some requirements for pg_upgrade as well?
I have nothing to add to what Bruce already said.