 ID:               37945
 Updated by:       [EMAIL PROTECTED]
 Reported By:      andreas dot schmidt at stasy dot de
-Status:           Open
+Status:           Closed
 Bug Type:         *Directory/Filesystem functions
 Operating System: Linux
 PHP Version:      5.1.4
 Assigned To:      moriyoshi
 New Comment:

This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.

Thank you for the report, and for helping us make PHP better.


Previous Comments:
------------------------------------------------------------------------

[2006-07-03 07:13:24] [EMAIL PROTECTED]

There still seems to be a regression with utf8 and PHP-5:

[EMAIL PROTECTED]:~$ php -r 'var_dump(pathinfo("/usr/bin/äggi")); echo `printenv|egrep "LANG|LC"`;';
array(2) {
  ["dirname"]=>
  string(8) "/usr/bin"
  ["basename"]=>
  string(3) "ggi"
}
LANG=en_GB.UTF-8
LANGUAGE=en_GB:en

[EMAIL PROTECTED]:~$ php -r 'setlocale(LC_ALL, getenv("LANG")); var_dump(pathinfo("/usr/bin/äggi")); echo `printenv|egrep "LANG|LC"`;';
array(2) {
  ["dirname"]=>
  string(8) "/usr/bin"
  ["basename"]=>
  string(3) "ggi"
}
LANG=en_GB.UTF-8
LANGUAGE=en_GB:en

[EMAIL PROTECTED]:~$ php -r 'var_dump(pathinfo("/usr/bin/äßßü")); echo `printenv|egrep "LANG|LC"`;';
array(2) {
  ["dirname"]=>
  string(8) "/usr/bin"
  ["basename"]=>
  string(3) "bin"
}
LANG=en_GB.UTF-8
LANGUAGE=en_GB:en

Same on latin1 terminal:

s1-iw:~$ php -r 'var_dump(pathinfo("/usr/bin/äggi")); echo `printenv|egrep "LANG|LC"`;';
array(2) {
  ["dirname"]=>
  string(8) "/usr/bin"
  ["basename"]=>
  string(4) "äggi"
}
LC_ALL=en_US.iso-8859-1
LANG=en_US.iso-8859-1

s1-iw:~$ php -r 'setlocale(LC_ALL, getenv("LANG")); var_dump(pathinfo("/usr/bin/äggi")); echo `printenv|egrep "LANG|LC"`;';
array(2) {
  ["dirname"]=>
  string(8) "/usr/bin"
  ["basename"]=>
  string(4) "äggi"
}
LC_ALL=en_US.iso-8859-1
LANG=en_US.iso-8859-1

s1-iw:~$ php -r 'var_dump(pathinfo("/usr/bin/äßßü")); echo `printenv|egrep "LANG|LC"`;';
array(2) {
  ["dirname"]=>
  string(8) "/usr/bin"
  ["basename"]=>
  string(4) "äßßü"
}
LC_ALL=en_US.iso-8859-1
LANG=en_US.iso-8859-1

Everything also works well with PHP-4 in a utf8 console.

------------------------------------------------------------------------

[2006-07-02 22:30:31] [EMAIL PROTECTED]

Set the LANG (or, if necessary, LC_CTYPE) environment variable to the
correct value. The function expects your filesystem's locale to be the
same as the one given by the environment variable. You also have to set
up the libc's locale data.

If you were to use pathinfo() / dirname() / basename() on URIs, just
don't do that. They are not designed for such a purpose.

------------------------------------------------------------------------

[2006-06-29 15:25:28] [EMAIL PROTECTED]

This does indeed seem to be a regression introduced in PHP5.

http://cvs.php.net/viewvc.cgi/php-src/ext/standard/string.c?r1=1.405&r2=1.406&pathrev=PHP_5_2

------------------------------------------------------------------------

[2006-06-28 11:20:26] andreas dot schmidt at stasy dot de

This bug has nothing to do with Unicode!!!

The bug occurs when special characters like äöüé are used. These
characters are part of the ISO-8859-1 character set!

------------------------------------------------------------------------

[2006-06-28 11:08:36] [EMAIL PROTECTED]

Unicode support will appear only in PHP6; you have to wait until then.

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
    http://bugs.php.net/37945

-- 
Edit this bug report at http://bugs.php.net/?id=37945&edit=1