Re: [PHP] When is "z" != "z" ?

Rasmus Lerdorf Mon, 05 Jun 2006 18:55:21 -0700

tedd wrote:

For example, the Unicode issue was raised during this discussion -- if php 
doesn't consider the numeric relationship of characters, then I see a big 
problem waiting in the wings. Because if we're having these types of 
discussions with just considering 00-7F characters, then I can only guess at 
what's going to happen when we start considering 000000-FFFFFF code-points.


Now, was that enough said?  :-)

I don't think you really understand this. < and > are collation operators when they operate on strings. They have absolutely nothing to do with the numeric values of the characters. It just so happens that in English iso-8859-1 there is a 1:1 relationship between the numeric values and the collation order, but you can think of that as dumb luck.


To better understand this, I suggest you start reading here:

  http://icu.sourceforge.net/userguide/Collate_Intro.html

Note one of the points on that page: in Lithuanian, 'y' falls between 'i' and 'k'. So even without going into Unicode and just using low-ascii, you have these issues.

Now, until we get to PHP 6, we don't have decent Unicode support and we don't have LOCALE-aware operators. You will have to manually use strcoll() to get them, but that is going to change and you will have the ICU collation algorithms available and for Unicode strings it will be automatic. You can still have binary-strings if you don't want locale-aware collation, of course.
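
As a rough sketch of what manually using strcoll() looks like, the snippet below sorts the same list twice: once byte-wise with strcmp() and once with strcoll() under whatever collation locale setlocale() manages to select. The word list and the locale names are only illustrative assumptions; which locales are installed depends on the system, and the collated order will vary with them.

```php
<?php
// Illustrative word list (an assumption, not from the thread).
$words = ['zebra', 'apple', 'Banana'];

// Ask for a collation locale. setlocale() tries each name in turn and
// returns false if none is installed; 'C' is the byte-order fallback.
$locale = setlocale(LC_COLLATE, 'en_US.UTF-8', 'en_US', 'C');

// Byte-wise ordering: strcmp() ignores the locale, so uppercase 'B'
// (0x42) sorts ahead of lowercase 'a' (0x61).
$bytewise = $words;
usort($bytewise, 'strcmp');

// Locale-aware ordering: strcoll() follows the LC_COLLATE rules of
// the locale selected above.
$collated = $words;
usort($collated, 'strcoll');

echo 'Locale in use: ', ($locale === false ? '(unchanged)' : $locale), "\n";
echo 'strcmp:  ', implode(', ', $bytewise), "\n";
echo 'strcoll: ', implode(', ', $collated), "\n";
```

If only the C locale is available, both lines will most likely print the same order, since strcoll() then falls back to byte comparison.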


-Rasmus

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
