Re: ExecBuildGroupingEqual versus collations

Andres Freund Fri, 14 Dec 2018 12:43:48 -0800

Hi,

On 2018-12-14 14:25:30 -0500, Tom Lane wrote:
> In pursuit of places that might not be on board with non-default
> collations, I tried sticking
> 
>     Assert(PG_GET_COLLATION() == C_COLLATION_OID);
> 
> into nameeq() and the other name comparison functions, in my build with
> type name's typcollation changed to "C".  I'm down to one place in the
> core regression tests that falls over because of that:
> ExecBuildGroupingEqual() figures it can get away with
> 
>         InitFunctionCallInfoData(*fcinfo, finfo, 2,
>                                  InvalidOid, NULL, NULL);
> 
> i.e. passing collation = InvalidOid to the type-specific equality
> functions.


Yea, I just cargo-culted that behaviour forward, the code previously
did:

bool
execTuplesMatch(TupleTableSlot *slot1,
                TupleTableSlot *slot2,
                int numCols,
                AttrNumber *matchColIdx,
                FmgrInfo *eqfunctions,
                MemoryContext evalContext)
...
        /* Apply the type-specific equality function */

        if (!DatumGetBool(FunctionCall2(&eqfunctions[i],
                                        attr1, attr2)))
        {
            result = false;     /* they aren't equal */
            break;
        }


> Now, it's certainly true that nameeq() doesn't need a collation spec
> today, any more than texteq() does, because both types legislate that
> equality is bitwise.  But if we leave ExecBuildGroupingEqual like this,
> we're mandating that no type anywhere, ever, can have a
> collation-dependent notion of equality.  Is that really a restriction
> we're comfortable with?  citext is sort of the poster child here,
> because it already wishes it could have collation-dependent equality.

Don't we rely on that in other places too?


> We'd have to do some work to pass a reasonable collation choice through
> to this code from wherever we have the info available, but I don't
> think it'd be a huge amount of work.

Yea, I think it shouldn't be too hard for most callers.

I think one issue here is that it's not clear how it'd be sensible to
have collation dependent equality for cases where we don't have a real
way to override the collation for an operation.  It's not too hard to
imagine adding something to GROUP BY, but set operations seem harder.


> BTW, this case is perhaps a bit of a counterexample to Peter's idea about
> doing collation lookups earlier: if we change this, we'd presumably have
> to do collation lookup right here, even though we know very well that
> common types' equality functions won't use the information.

Hm.

Greetings,

Andres Freund

Re: ExecBuildGroupingEqual versus collations

Reply via email to