Re: [PATCH] Refactor *_abbrev_convert() functions

John Naylor Mon, 23 Feb 2026 01:40:54 -0800

On Tue, Feb 3, 2026 at 9:56 PM Aleksander Alekseev
<[email protected]> wrote:
> > There's more we can do here. Above the stanzas changed in the patch
> > there is this, at least for varlena/bytea:
> >
> > hash = DatumGetUInt32(hash_any((unsigned char *) authoritative_data,
> >                       Min(len, PG_CACHE_LINE_SIZE)));
> >
> > This makes no sense to me: hash_any() calls hash_bytes() and turns the
> > result into a Datum, and then we just get it right back out of the
> > Datum again.
>
> I see similar patterns in files other than bytea.c and varlena.c.
> Implemented as a separate patch.


I think it makes sense to squash 0001 and 0003 together, then 0002 and
0004 together.

For the first, we should probably combine in the upper half when using
a 64-bit hash, like this:

     /* Hash abbreviated key */
     {
-        uint32          tmp;
+        uint64          tmp;

-        tmp = DatumGetUInt32(res) ^ (uint32) (DatumGetUInt64(res) >> 32);
-        hash = DatumGetUInt32(hash_uint32(tmp));
+        tmp = murmurhash64(DatumGetUInt64(res));
+        hash = (uint32) tmp ^ tmp >> 32;
     }
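
To illustrate the fold outside of the tree, here is a minimal
standalone sketch. It assumes murmurhash64() is the usual MurmurHash3
64-bit finalizer; mix64() and fold_hash64() are hypothetical names
used only for this example, not functions from the patch.

```c
#include <stdint.h>

/*
 * Stand-in for murmurhash64(): the MurmurHash3 64-bit finalizer.
 * Assumption: this matches the in-tree function; check the real
 * definition before relying on the exact constants.
 */
static inline uint64_t
mix64(uint64_t h)
{
    h ^= h >> 33;
    h *= UINT64_C(0xff51afd7ed558ccd);
    h ^= h >> 33;
    h *= UINT64_C(0xc4ceb9fe1a85ec53);
    h ^= h >> 33;
    return h;
}

/*
 * Mix the full 64-bit key first, then XOR the upper half into the
 * lower half so that all 64 input bits can influence the 32-bit
 * result.
 */
static inline uint32_t
fold_hash64(uint64_t key)
{
    uint64_t    tmp = mix64(key);

    return (uint32_t) tmp ^ (uint32_t) (tmp >> 32);
}
```

The explicit parentheses and casts are only for readability; the
expression in the diff above parses the same way, since the cast binds
tighter than ^ and >> binds tighter than ^.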

> Using hash_bytes_uint32() / hash_bytes_uint32_extended() directly in
> timetz_hash() / timetz_hash_extended() is safe though. Proposed as a
> separate patch.

0005 doesn't buy us as much in readability since the two lines no longer match.

Further cleanup possible now that we have 64-bit datums: MAC addresses
are always 6 bytes, so abbreviation is no longer relevant -- datum1 is
authoritative. That's in scope for the thread subject, though it is
also a bigger patch; maybe someone would like to pick it up for PG20.
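
A rough sketch of why that works, assuming a 6-byte address is packed
most-significant byte first into a 64-bit integer. The mac6 type and
both helpers are hypothetical names for illustration, not existing
code:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 6-byte MAC address, stored in network byte order. */
typedef struct
{
    unsigned char b[6];
} mac6;

/*
 * Pack all six bytes into the low 48 bits of a 64-bit key, first byte
 * most significant. Nothing is truncated, so the key is the whole
 * value rather than an abbreviation.
 */
static inline uint64_t
mac6_to_key(const mac6 *m)
{
    uint64_t    k = 0;

    for (int i = 0; i < 6; i++)
        k = (k << 8) | m->b[i];
    return k;
}

/* Unsigned comparison of keys orders the same way as memcmp(). */
static inline int
mac6_key_cmp(uint64_t x, uint64_t y)
{
    return (x > y) - (x < y);
}
```

Since the key carries every byte, comparing keys never needs a
tie-break against the original value.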

--
John Naylor
Amazon Web Services
