tedd wrote:
For example, the Unicode issue was raised during this discussion -- if php 
doesn't consider the numeric relationship of characters, then I see a big 
problem waiting in the wings. Because if we're having these types of 
discussions with just considering 00-7F characters, then I can only guess at 
what's going to happen when we start considering 000000-FFFFFF code-points.

Now, was that enough said?  :-)

I don't think you really understand this. When < and > operate on strings, they are collation operators; conceptually they have nothing to do with the numeric values of the characters. It just so happens that for English text in ISO-8859-1 the collation order lines up 1:1 with the numeric values, but you can think of that as dumb luck.

To better understand this, I suggest you start reading here:

  http://icu.sourceforge.net/userguide/Collate_Intro.html

Note one of the points on that page: in Lithuanian, 'y' falls between 'i' and 'k'. So even without going into Unicode, using nothing but low ASCII, you have these issues.

Now, until we get to PHP 6, we don't have decent Unicode support and we don't have locale-aware operators, so you have to call strcoll() yourself to get locale-aware comparison. That is going to change: you will have the ICU collation algorithms available, and for Unicode strings they will be applied automatically. You can still have binary strings if you don't want locale-aware collation, of course.
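
For the manual strcoll() route, something along these lines should work. Treat it as a rough sketch: sort_with_locale() is a made-up helper name, not a PHP built-in, and the locale names you can pass depend entirely on what is installed on your system.

```php
<?php
// Hypothetical helper (not part of PHP): sort strings using the collation
// rules of $locale instead of raw byte values.
function sort_with_locale(array $words, $locale)
{
    // Passing "0" returns the current LC_COLLATE setting without changing it.
    $previous = setlocale(LC_COLLATE, "0");

    // setlocale() returns false when the requested locale is unavailable.
    if (setlocale(LC_COLLATE, $locale) === false) {
        return null;
    }

    // strcoll() compares two strings according to the current LC_COLLATE.
    usort($words, 'strcoll');

    // Put the previous collation locale back so other code is unaffected.
    setlocale(LC_COLLATE, $previous);

    return $words;
}

// In the "C" locale, strcoll() falls back to plain byte order,
// so uppercase sorts before lowercase.
print_r(sort_with_locale(array('b', 'a', 'B'), 'C'));
```

With a Lithuanian locale installed (the name 'lt_LT.UTF-8' is an assumption and varies by platform), passing array('k', 'y', 'i') should give you the ordering described on the ICU page, but check the result on your own system rather than taking my word for it.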

-Rasmus

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php