I would like to know if a patch to make strtolower and strtoupper do
plain ASCII case conversion would be accepted, or if an RFC should be
created.

The situation with case conversion is inconsistent.

The following functions do ASCII case conversion: strcasecmp,
strncasecmp, substr_compare.

The following functions do locale-dependent case conversion:
strtolower, strtoupper, str_ireplace, stristr, stripos, strripos,
strnatcasecmp, ucfirst, ucwords, lcfirst.

I would make them all do ASCII case conversion.

Developers need ASCII case conversion, because it is used internally
by PHP for things like class name comparison, and because it is a
specified algorithm in HTML 5 and related standards.

The existing options for ASCII case conversion are:

* Never call setlocale(). But this breaks non-ASCII characters in
escapeshellarg() and can't be guaranteed in a library.

* Call setlocale(LC_ALL, "C.UTF-8"). But this is non-portable and also
can't be guaranteed in a library.

* Use strtr(). But this is ugly and slow.
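
For reference, the strtr() option above can be sketched roughly as follows. The helper names ascii_strtolower() and ascii_strtoupper() are hypothetical, not part of PHP; this is only a minimal illustration of the workaround, not a proposed implementation.

```php
<?php
// Hypothetical helpers: ASCII-only case conversion built on strtr().
// The three-argument form of strtr() maps single bytes, so any byte
// outside A-Z / a-z (including UTF-8 sequences) passes through unchanged,
// regardless of the current locale.

function ascii_strtolower(string $s): string
{
    return strtr($s, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz');
}

function ascii_strtoupper(string $s): string
{
    return strtr($s, 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ');
}
```

Every caller that needs ASCII semantics has to carry or import helpers like these, which is the "ugly" part of this option.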

If mbstring has a way to do it, I can't find it. I tested
mb_strtolower($s, '8bit') and mb_strtolower($s, 'ascii'), and neither
did what I wanted.

Note that locale-dependent case conversion is almost never a useful
feature. Strings are passed through tolower() one byte at a time, and
each byte is interpreted according to some legacy 8-bit character set.
So the result will typically be mojibake even if the correct locale is
selected.
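
To illustrate the byte-at-a-time problem without depending on which locales are installed, here is a small simulation. The function latin1_bytewise_tolower() is a hypothetical stand-in for what tolower() does under an ISO-8859-1 locale; it does not call setlocale(), and real results depend on the C library and locale in use.

```php
<?php
// Hypothetical simulation of an ISO-8859-1 tolower() applied to each byte.
function latin1_bytewise_tolower(string $s): string
{
    $out = '';
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        $b = ord($s[$i]);
        $isAsciiUpper = $b >= 0x41 && $b <= 0x5A;
        // Latin-1 uppercase letters are 0xC0-0xDE, except 0xD7 (multiplication sign).
        $isLatin1Upper = $b >= 0xC0 && $b <= 0xDE && $b !== 0xD7;
        $out .= chr(($isAsciiUpper || $isLatin1Upper) ? $b + 0x20 : $b);
    }
    return $out;
}

$input = "\xC3\x89T\xC3\x89";                // "ÉTÉ" encoded as UTF-8
$output = latin1_bytewise_tolower($input);

echo bin2hex($input), "\n";                  // c38954c389
echo bin2hex($output), "\n";                 // e38974e389
var_dump(preg_match('//u', $output) === 1);  // bool(false): no longer valid UTF-8
```

The lead byte 0xC3 of each "É" is treated as the Latin-1 letter "Ã" and lowered to 0xE3, while the continuation byte 0x89 is left alone, so the output is not valid UTF-8 at all.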

strtolower() mangles UTF-8 strings in many locales, such as fr-FR. I
made a full list at <https://phabricator.wikimedia.org/T291234>. The
UTF-8 locales mostly work, except for the Turkish ones, which mangle
ASCII strings.

At <https://bugs.php.net/bug.php?id=67815>, Nikita Popov wrote: "My
general recommendation is to avoid locales and locale-dependent
functions, as locales are a fundamentally broken concept." I agree
with that. I think PHP should migrate away from locale dependence.
When PHP was young, it was convenient to use the C library, but we've
progressed well past that point now.

-- Tim Starling

-- 
PHP Internals - PHP Runtime Development Mailing List