Re: [HACKERS] PATCH: CITEXT 2.0

David E. Wheeler Mon, 07 Jul 2008 13:15:23 -0700

On Jul 7, 2008, at 12:46, Zdenek Kotala wrote:

So, the upshot is that the = and <> operators are not locale-aware, yes? They just do byte comparisons. Is that really the way it should be? I mean, could there not be strings that are equivalent but have different bytes?
Correct. The problem is complex. It works fine only for normalized strings. But postgres now assumes that all UTF-8 strings are normalized.


I see. So binary equivalence is okay, in that case.
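
To make the "equivalent but different bytes" case concrete, here is a minimal sketch in C. It is not code from the patch: bytes_equal() and the two constants are hypothetical names used only for illustration. It encodes "é" in UTF-8 twice, once precomposed (U+00E9) and once decomposed (U+0065 followed by U+0301), and shows that a byte comparison treats them as different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* "é" as one precomposed code point, U+00E9, in UTF-8 (2 bytes). */
static const char e_acute_composed[] = "\xC3\xA9";

/* "é" as "e" plus combining acute accent, U+0065 U+0301, in UTF-8 (3 bytes). */
static const char e_acute_decomposed[] = "e\xCC\x81";

/*
 * Hypothetical helper: byte-wise equality of two length-counted strings.
 * No locale and no Unicode normalization is involved.
 */
static bool
bytes_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    assert(a != NULL && b != NULL);
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

Calling bytes_equal(e_acute_composed, 2, e_acute_decomposed, 3) returns false even though both strings render as the same character. If all stored strings are in one normalization form, as assumed above, this mismatch cannot occur and byte equality is enough.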

If you need to implement the < <= >= > operators, you need to use strcoll, which takes care of locale collation.


Which varstr_cmp() does, I guess. It's what textlt uses, for example.
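
As a rough illustration of the difference between the two kinds of ordering, here is a minimal sketch in C. It is not the varstr_cmp() implementation, which works on text datums rather than NUL-terminated strings; collate_cmp(), byte_cmp() and sign() are hypothetical helpers introduced only for this example.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reduce a comparison result to -1, 0 or 1. */
static int
sign(int x)
{
    return (x > 0) - (x < 0);
}

/*
 * Locale-aware ordering: strcoll() compares according to the LC_COLLATE
 * category of the current locale, as set with setlocale().
 */
static int
collate_cmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return sign(strcoll(a, b));
}

/* Byte-wise ordering: strcmp() compares unsigned char values only. */
static int
byte_cmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return sign(strcmp(a, b));
}
```

In the "C" locale the two functions agree. After something like setlocale(LC_COLLATE, "") under a non-C locale, collate_cmp() may order strings differently from byte_cmp(), for example with respect to case or accented letters; the exact result depends on the locale definitions installed on the system.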

See the Unicode Collation Algorithm: http://www.unicode.org/reports/tr10/


Wow, that looks like a fun read.

Best,

David


