ID:               48322
Updated by:       j...@php.net
Reported By:      netspy at me dot com
-Status:          Open
+Status:          Wont fix
Bug Type:         *Unicode Issues
Operating System: Mac OS X
PHP Version:      5.2.9
New Comment:

I get the wrong order on Linux. Did you mix up the results there?
Anyway, this really is a problem in Unicode support. To get _really_
working stuff, use the intl extension or wait for PHP 6. Wont fix.


Previous Comments:
------------------------------------------------------------------------

[2009-05-19 12:35:29] netspy at me dot com

On Linux strcoll works fine; I get a wrong order only on Mac OS X (BSD).

I also tested it with an ISO 8859-1 string and the locale
de_DE.ISO8859-1. The result is the same: correct on Linux, wrong on
Mac OS X. So I think it's not a Unicode issue!

Here is another test code:

$string_utf = "abcdefghijklmnopqrstuvwxyzäöüß";
$string_iso = utf8_decode($string_utf);

$array_utf = array();
$array_iso = array();

for ($i=0; $i<mb_strlen($string_utf, 'UTF-8'); $i++) {
    $array_utf[]=mb_substr($string_utf, $i, 1, 'UTF-8');
    $array_iso[]=substr($string_iso, $i, 1);
}

print("\nLocale: " . setlocale(LC_COLLATE, 'de_DE.UTF-8'));
usort($array_utf, 'strcoll');
print("\n" . implode('', $array_utf) . "\n");

print("\nLocale: " . setlocale(LC_COLLATE, 'de_DE.ISO8859-1'));
usort($array_iso, 'strcoll');
print("\n" . utf8_encode(implode('', $array_iso)) . "\n");

The result on Mac OS X:

Locale: de_DE.UTF-8
abcdefghijklmnopqrstuvwxyzßäöü

Locale: de_DE.ISO8859-1
abcdefghijklmnopqrstuvwxyzßäöü

And the Linux result:

Locale: de_DE.UTF-8
aäbcdefghijklmnoöpqrsßtuüvwxyz

Locale: de_DE.ISO8859-1
aäbcdefghijklmnoöpqrsßtuüvwxyz

------------------------------------------------------------------------

[2009-05-19 10:50:59] j...@php.net

It doesn't work on any system below PHP 6. You can always use the intl
extension from PECL while waiting for proper Unicode support:

http://pecl.php.net/intl

Using the collator (http://php.net/collator) you can achieve sorting
with any locale.

------------------------------------------------------------------------

[2009-05-18 22:37:22] netspy at me dot com

Description:
------------
strcoll() does not sort UTF-8 strings correctly on Mac OS X.

Reproduce code:
---------------
$locale = 'de_DE.UTF-8';
$string = "abcdefghijklmnopqrstuvwxyzäöüß";
$array = array();

for ($i=0; $i<mb_strlen($string, 'UTF-8'); $i++) {
    $array[]=mb_substr($string, $i, 1, 'UTF-8');
}

$oldLocale = setlocale(LC_COLLATE, "0");
print("\nOld: $oldLocale   New: ");
print(setlocale(LC_COLLATE, $locale));

usort($array, 'strcoll');
setlocale(LC_COLLATE, $oldLocale);

print("\n" . implode('', $array) . "\n");

Expected result:
----------------
Old: C   New: de_DE.UTF-8
aäbcdefghijklmnoöpqrsßtuüvwxyz

Actual result:
--------------
Old: C   New: de_DE.UTF-8
abcdefghijklmnopqrstuvwxyzßäöü

------------------------------------------------------------------------
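
For anyone following the suggestion to use the intl collator instead of
strcoll(), here is a minimal sketch of the same test. It assumes the intl
extension is loaded, PHP 7 or later (for the type declarations), and a
source file saved as UTF-8. sortWithCollator() is a hypothetical helper
name used only for illustration; it is not part of PHP or of the original
report.

```php
<?php
// Sketch only: locale-aware sorting through intl's Collator (ICU),
// which does not depend on the C library's LC_COLLATE tables.
// sortWithCollator() is a hypothetical helper, not a PHP built-in.
function sortWithCollator(array $items, string $locale): array
{
    if (!class_exists('Collator')) {
        throw new RuntimeException('The intl extension is not loaded.');
    }

    $collator = new Collator($locale);

    // SORT_STRING makes the collator compare every element as a string.
    $collator->sort($items, Collator::SORT_STRING);

    return $items;
}

$string = 'abcdefghijklmnopqrstuvwxyzäöüß';

// Split the UTF-8 string into single characters.
$chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);

print(implode('', sortWithCollator($chars, 'de_DE')) . "\n");
```

With ICU's standard German collation this should give the order listed
under "Expected result" on both Linux and Mac OS X, since the comparison
no longer goes through the platform's strcoll(). An existing
usort($array, 'strcoll') call could alternatively be changed to
usort($array, array($collator, 'compare')).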