Hi, > On Apr 23, 2026, at 07:33, David Rowley <[email protected]> wrote: > > Yeah, it doesn't make sense to repeatedly perform a linear search over > the array to check if NULL matches anything in the array. Let's just > do that once when we build the hash table and reuse that cached value > whenever we see a NULL. We can skip that step with strict functions > since we'll short-circuit earlier. > > A patch for that is attached.
Thanks for working on this. Overall, this version looks good to me, and I'm fine with the current approach. One possible improvement, though not a blocker, would be to defer the lhs-NULL handling until we actually encounter the first NULL on the lhs. That could avoid a bit of extra work in the common case where the lhs contains no NULLs. That said, I think the current implementation is perfectly OK as-is. > IMO it's unrealistic to assume we can do anything sane with an > equality function that always returns NULL. > > I really doubt it's worth troubling over that. If we did want to do > something, then it would be more efficient to probe the hash table > directly after we insert a Datum and verify we can find it again. If > we can't find any value we just inserted, mark the entire table as > broken and have it so we check for that and do a linear search. I tend to agree. Even if such a case can be constructed, it seems rare enough that I am not sure it is worth adding more complexity, or extra overhead in the common hashed SAOP path, to handle it in this patch. I think we can revisit that separately if a concrete case turns up that seems worth looking into. -- Best regards, Chengpeng Yan
