Re: [HACKERS] PATCH: CITEXT 2.0

Zdenek Kotala Mon, 07 Jul 2008 12:50:30 -0700

David E. Wheeler wrote:

On Jul 7, 2008, at 12:21, David E. Wheeler wrote:
My question is: why? Shouldn't they all use the same function for comparison? I'm happy to dupe this implementation for citext, but I don't understand it. Should not all comparisons be executed consistently?
Let me try to answer my own question by citing this comment:

    /*
     * Since we only care about equality or not-equality, we can avoid all the
     * expense of strcoll() here, and just do bitwise comparison.
     */
So, the upshot is that the = and <> operators are not locale-aware, yes? They just do byte comparisons. Is that really the way it should be? I mean, could there not be strings that are equivalent but have different bytes?

Correct. The problem is complex. Byte comparison works correctly only for normalized strings, but Postgres currently assumes that all UTF-8 strings are normalized.
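
To make the "equivalent but different bytes" case concrete, here is a minimal sketch. It is not code from the PostgreSQL source; bytes_equal() and the two constants are hypothetical names used only for illustration. It encodes "é" once precomposed (U+00E9, NFC) and once as "e" plus a combining acute accent (U+0301, NFD), and compares them bytewise the way the quoted comment describes.

```c
#include <stddef.h>
#include <string.h>

/* "é" as one precomposed code point, U+00E9 (NFC form), in UTF-8. */
const char E_ACUTE_NFC[] = "\xC3\xA9";

/* "é" as "e" followed by combining acute U+0301 (NFD form), in UTF-8. */
const char E_ACUTE_NFD[] = "e\xCC\x81";

/*
 * Hypothetical helper: bytewise equality, in the spirit of the quoted
 * comment. Returns 1 if both buffers hold identical bytes, else 0.
 */
int bytes_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    if (alen != blen)
        return 0;               /* different lengths can never match */
    return memcmp(a, b, alen) == 0;
}
```

Called on E_ACUTE_NFC and E_ACUTE_NFD with their strlen() lengths, bytes_equal() reports them as different, even though both render as the same character. If the input is normalized to one form before it is stored, the bytewise test gives the expected answer.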

If you need to implement the <, <=, >= and > operators, you need to use strcoll(), which takes care of locale collation.
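
As a rough sketch of what a locale-aware ordering function could look like: locale_compare() below is a hypothetical name, not the function citext or the backend uses. It relies only on the standard strcoll() and strcmp(). The strcmp() tie-break is an assumption of this sketch; it is one way to keep the ordering consistent with a bytewise = operator, so that two strings compare as equal only when their bytes are identical.

```c
#include <string.h>

/*
 * Hypothetical helper: compare two NUL-terminated strings using the
 * collation rules of the current LC_COLLATE locale.
 *
 * Returns <0, 0 or >0, like strcmp(). If strcoll() considers the
 * strings equal, fall back to strcmp() so that only byte-identical
 * strings compare as 0.
 */
int locale_compare(const char *a, const char *b)
{
    int result = strcoll(a, b);

    if (result == 0)
        result = strcmp(a, b);  /* tie-break on raw bytes */
    return result;
}
```

A program starts in the "C" locale, where strcoll() orders strings the same way strcmp() does; calling setlocale(LC_COLLATE, "") once at startup selects the collation from the environment instead. The <, <=, >= and > operators can then all be defined in terms of the sign of the result.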


See the Unicode Collation Algorithm: http://www.unicode.org/reports/tr10/

                Zdenek




--
Zdenek Kotala              Sun Microsystems
Prague, Czech Republic     http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
