On Thu, Jan 2, 2020 at 6:42 AM Robert Haas <robertmh...@gmail.com> wrote: > On Mon, Dec 30, 2019 at 6:58 PM Peter Geoghegan <p...@bowt.ie> wrote: > > I propose that we adopt the following definition: For an operator > > class to be safe, its equality operator has to always agree with > > datum_image_eq() (i.e. two datums must be bitwise equal after > > detoasting). > > I suggested using datumIsEqual() as the canonical definition. (I > wonder why datum_image_eq() does not reuse that function?)
The difference between datum_image_eq() and datumIsEqual() is that only the former will consider two datums equal when they happen to have different TOAST input states -- we need that here. datumIsEqual() avoids doing this because sometimes it needs to work for callers operating within an aborted transaction. datum_image_eq() was originally used for the "*=, *<>, *<, *<=, *>, and *>=" rowtype B-Tree operator class needed by REFRESH MATERIALIZED VIEW CONCURRENTLY. (Actually, that's not quite true, since datum_image_eq() is a spin-off of the rowtype code that was added much more recently to fix a bug in foreign keys.) The B-Tree code and amcheck need to be tolerant of inconsistent TOAST input states. This isn't particularly likely to happen, but it would be hard to revoke the general assumption that that's okay now. Also, it's not that hard to deal with it directly. For example, we're not reliant on equal index tuples all being the same size in the deduplication patch. -- Peter Geoghegan