Yeah, I got that the reason is linguistic in its origin. It is great
when trying to search a mass of text. But when you try to do a
matching search for an exact string it does complicate things a lot
when you still think that = really means exactly equal.

Doing WHERE username = 'myname' I (as a programmer) never ever want to
match anything else but exactly that.
Doing WHERE article LIKE '%cake%' I would not at all be this critical
or surprised since it is a different kind of searching in my world.

Also, I was under the mistaken impression that COLLATE was ONLY
related to how these special characters are sorted, which I have no
problem with, btw. I had no idea that collation also affected simple
matching searches.
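
For anyone else who wants to see the effect, this is roughly how it
can be checked from the mysql client. A minimal sketch, assuming the
client really sends UTF-8 and the connection character set is utf8
(hence the SET NAMES); it uses only literals, so no tables are needed:

```sql
-- Assumption: the client sends UTF-8 bytes on this connection.
SET NAMES utf8;

-- General collation: accents are ignored, so å compares equal to a.
SELECT 'a' = 'å' COLLATE utf8_general_ci AS general_match;

-- Binary collation: strings are compared by their character codes.
SELECT 'a' = 'å' COLLATE utf8_bin AS binary_match;
```

The first query should return 1 and the second 0.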

The equal sign has a special place in my heart. :)
I guess the binary collation will be my preference for general data.
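
What I have in mind is something like the following. It is only a
sketch: the users table, its columns and the index name are made up
for illustration.

```sql
-- Hypothetical table: only the username column is given a binary
-- collation, so = on that column means exactly equal.
CREATE TABLE users (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(50) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    UNIQUE KEY uniq_username (username)
);

-- No COLLATE needed in the query; the column's collation is used.
SELECT id FROM users WHERE username = 'myname';
```

One thing to keep in mind is that utf8_bin is also case-sensitive, so
'MyName' would no longer match 'myname'.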

Do you have any advice for a web application with multiple languages?
You can only benefit from the linguistic rules as long as the language
of the data and the collation match. How would you cater for, say, a
blog in both German and French? Set the database defaults to general
or binary and then add COLLATE utf8_french_ci to the queries?
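
To make the question concrete, here is roughly what I mean. It is an
untested sketch: the posts table and its columns are made up, and
since I have not checked that a utf8_french_ci collation actually
ships with MySQL, it uses utf8_unicode_ci and utf8_bin as stand-ins.

```sql
-- Hypothetical table with a general default collation.
CREATE TABLE posts (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    lang CHAR(2) NOT NULL,
    title VARCHAR(255) NOT NULL
) DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;

-- Override the collation for sorting in a single query.
SELECT id, title
FROM posts
WHERE lang = 'de'
ORDER BY title COLLATE utf8_unicode_ci;

-- Override the collation for an exact match in a single query.
SELECT id, title
FROM posts
WHERE lang = 'fr'
  AND title = 'École' COLLATE utf8_bin;
```

As far as I understand, overriding the collation in the query like
this may stop MySQL from using an index on the column, so I am not
sure it is a good idea for large tables.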


On Jun 13, 4:28 pm, "Jonathan Snook" <[EMAIL PROTECTED]> wrote:
> > I am a bit shocked that it is a "feature" when å is the same as a in
> > MySQL. That sounds just plain wrong to me. If it had been so for
> > utf8_some_special_ci, fine, but not for general (the default default)
> > collations. To me that would be like PHP saying (1 == 1.2) is true
> > because it is "close enough". :) Very strange but I guess they must
> > have some very good reason for it.
> It's not really the same thing and yes, there's a very good reason.
> In most languages, diacritics are meant to alter the pronunciation of a
> letter. In other words, e, é and è are the same "letter" but have
> different pronunciations because of the accent marks. Therefore, when
> a French person does a search for a word, they might simply type in
> "ecole" but they fully expect école to show up. Another example, I
> live in a city known as Orléans but has been known as Orleans (note
> the lack of accent) for a number of years (they only recently added
> the accent back in where it belongs). However, a search for Orleans
> should bring up either result. Also, collations determine how content
> is ordered when results are returned. Take Ecole A, ecole B, École C
> and école D. How should that be ordered? The _ci indicates
> case-insensitive so we get the order we expect (as I've listed). It'd
> be pretty confusing to do a search and get ecole B, Ecole A, [the rest
> of the latin character results], école D, École C.
> I hope that explains it a little better.