On Jun 1, 2008, at 21:08, Tom Lane wrote:

> "David E. Wheeler" <[EMAIL PROTECTED]> writes:
>> I really need case-insensitive string comparison in my database.
>
> Okay ... according to whose locale?

I'm using C. Of course you're correct that it depends on the locale; I always forget that. But doesn't the Unicode standard offer some sort of locale-independent case-insensitivity that gets it right a large percentage of the time?
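
To make the question concrete, here is a minimal sketch of what locale-independent comparison could look like, using Python's str.casefold(), which applies Unicode full case folding. The helper names fold() and ci_equal() are made up for illustration, and normalizing to NFC before folding is a simplification I'm assuming is good enough here, not the complete caseless-matching procedure from the standard.

```python
import unicodedata


def fold(s: str) -> str:
    """Build a locale-independent comparison key (hypothetical helper).

    NFC-normalize first so composed and decomposed spellings agree,
    then apply Unicode full case folding.
    """
    return unicodedata.normalize("NFC", s).casefold()


def ci_equal(a: str, b: str) -> bool:
    """Compare two strings case-insensitively, ignoring the process locale."""
    return fold(a) == fold(b)


if __name__ == "__main__":
    # German sharp s folds to "ss", so these compare equal.
    print(ci_equal("Straße", "STRASSE"))
    # Greek final sigma folds to the regular sigma, so these compare equal.
    print(ci_equal("ΣΊΣΥΦΟΣ", "σίσυφος"))
    # Turkish dotless i is where default folding falls short:
    # "I" folds to "i", not to "ı", so these compare unequal.
    print(ci_equal("DIYARBAKIR", "diyarbakır"))
```

The Turkish pair is the sort of case I mean by "a large percentage of the time": default folding handles most scripts sensibly but cannot know that a given string is Turkish.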

>> Ideally there'd be a nice ITEXT data type (and friends, ichar,
>> ivarchar, etc.). But of course there isn't, and for years I've just
>> used LOWER() on indexes and queries to get the same result.
>>
>> Only it turns out that I'm of course not getting the same result.
>
> I think that means you're not using the right locale.
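
For anyone following along, the LOWER() idiom from the quoted text looks roughly like the sketch below. It uses Python's sqlite3 module as a stand-in so that it runs anywhere; the users table, the users_email_lower index, and find_user() are hypothetical names, and SQLite's built-in lower() only folds ASCII, so this illustrates the pattern rather than PostgreSQL's locale-dependent behavior.

```python
import sqlite3

# In-memory database as a stand-in; table and index names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT NOT NULL)")

# Index the lowercased value so lookups on lower(email) can use the index
# and so two spellings of the same address cannot both be stored.
conn.execute("CREATE UNIQUE INDEX users_email_lower ON users (lower(email))")
conn.execute("INSERT INTO users VALUES (?)", ("David@Example.COM",))


def find_user(email):
    """Look up a user, lowercasing both sides of the comparison."""
    row = conn.execute(
        "SELECT email FROM users WHERE lower(email) = lower(?)", (email,)
    ).fetchone()
    return row[0] if row else None


def try_insert(email):
    """Return True if the insert succeeded, False if it hit the unique index."""
    try:
        conn.execute("INSERT INTO users VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    print(find_user("david@example.com"))
    print(try_insert("DAVID@EXAMPLE.COM"))
```

The query has to repeat the same expression the index was built on, which is the tedious part that an ITEXT-style type would hide.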

What locale is right? If I have a Web app, there could be data in many different languages in a single table/column.

>> 1. Does the use of the tolower() C function in the citext data type on
>> pgfoundry basically give me the same results as using lower() in my
>> SQL has for all these years?
>
> [ broken record... ] Kinda depends on your locale. However, tolower()
> is 100% guaranteed not to work for multibyte encodings, so citext is
> quite useless if you're using UTF8.  This is fixable, no doubt, but
> it's not fixed in the project as it stands.
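
To illustrate the multibyte problem, here is a small sketch that imitates a byte-at-a-time tolower() in the C locale. It is not the citext code; bytewise_lower() is a hypothetical stand-in that lowercases only the ASCII bytes A-Z and passes every other byte through untouched.

```python
def bytewise_lower(data: bytes) -> bytes:
    """Lowercase ASCII A-Z one byte at a time (hypothetical stand-in).

    This mimics calling tolower() on each byte in the C locale:
    bytes outside 0x41..0x5A are left alone.
    """
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in data)


if __name__ == "__main__":
    upper = "ÉCOLE@EXAMPLE.COM"

    # "É" is two bytes in UTF-8 (0xC3 0x89); neither is an ASCII capital,
    # so the byte-wise pass leaves it uppercase.
    print(bytewise_lower(upper.encode("utf-8")).decode("utf-8"))

    # Character-aware lowercasing handles it.
    print(upper.lower())

    # Pure-ASCII input is lowercased correctly by the byte-wise pass.
    print(bytewise_lower(b"David@Example.COM").decode("ascii"))
```

Pure-ASCII strings come through fine, while anything outside ASCII is silently left in its original case, so two spellings that differ only in the case of an accented letter would not compare equal.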

Right, okay; thanks. I'm thinking about using it for email addresses and domain names, however, so it might be adequate for those applications.

>> 2. Isn't the ICU library distributed with PostgreSQL?
>
> Nope, it is not, and we have already pretty much determined that we
> do not want to make Postgres depend on ICU.  See the archives.

Damn. Okay, thanks.

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
