Thanks a million for your answers, Martijn. I just have some more stupid questions, if you could bear with me.

On Jun 26, 2008, at 03:28, Martijn van Oosterhout wrote:

When creating an index, your comparison functions are going ot be
called O(N log N) times. If they leak into a context that isn't
regularly freed you may have a problem. I'd suggest loking at how the
text comparisons do it. PG_FREE_IF_COPY() is probably a good idea
because the incoming tuples may be detoasted.

Okay, I see that text_cmp in varlena doesn't use PG_FREE_IF_COPY(), and neither do text_smaller nor text_larger (which just dispatch to text_cmp anyway).

The operator functions *do* use PG_FREE_IF_COPY(). So I'm guessing it's these functions you're talking about. However, my implementation just looks like this:

Datum citext_ne (PG_FUNCTION_ARGS) {
// Fast path for different-length inputs. Okay for canonical equivalence?
    if (VARSIZE(PG_GETARG_TEXT_P(0)) != VARSIZE(PG_GETARG_TEXT_P(1)))
        PG_RETURN_BOOL( 1 );
    PG_RETURN_BOOL( citextcmp( PG_ARGS ) != 0 );
}

I don't *thinkI any variables are copied there. citextcmp() is just this:

int citextcmp (PG_FUNCTION_ARGS) {
    // XXX These are all just references to existing structures, right?
    text * left  = PG_GETARG_TEXT_P(0);
    text * right = PG_GETARG_TEXT_P(1);
    return varstr_cmp(
        cilower( left ),
        VARSIZE_ANY_EXHDR(left),
        cilower( right ),
        VARSIZE_ANY_EXHDR(right)
    );
}

Again, no copying. cilower() does copy:

    int    index, len;
    char * result;

    index  = 0;
    len    = VARSIZE(arg) - VARHDRSZ;
    result = (char *) palloc( strlen( str ) + 1 );

    for (index = 0; index <= len; index++) {
        result[index] = tolower((unsigned char) str[index] );
    }
    // XXX I don't need to pfree result if I'm returning it, right?
    return result;

But the copied value is returned. Hrm…it should probably be pfreed somewhere, yes?

So I'm wondering if I should change citextcmp to pfree values? Something like this:

    text * left  = PG_GETARG_TEXT_P(0);
    text * right = PG_GETARG_TEXT_P(1);
    char * lcstr = cilower( left  );
    char * rcstr = cilower( right );

    int result = varstr_cmp(
        cilower( left ),
        VARSIZE_ANY_EXHDR(left),
        cilower( right ),
        VARSIZE_ANY_EXHDR(right)
    );

    pfree( lcstr );
    pfree( rcstr );
    return result;

This is the only function that calls cilower(). And I *think* it's the only place where values are copied or memory is allocated needing to be freed. Does that sound right to you?

On a side note, I've implemented this pretty differently from how the text functions are implemented in varlena.c, just to try to keep things succinct. But I'm wondering now if I shouldn't switch back to the style used by varlena.c, if only to keep the style the same, and thus perhaps to increase the chances that citext would be a welcome contrib addition. Thoughts?

Many thanks again. You're a great help to this C n00b.

Best,

David
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to