I caused this situation myself by not explicitly differentiating between
the default charset for the internal htmlspecialchars() and
htmlentities() functions and the output charset ini directive
default_charset.

The idea behind the default_charset ini directive was to act as the
charset that gets specified in the HTTP Content-type header if you do
not explicitly send your own Content-type header with the header()
function. This has been muddied a bit by the fact that
htmlspecialchars/htmlentities can take it into account when they are
trying to choose which encoding to use when handling data passed to
them. This isn't done by default since it actually makes little sense.
It is only done if you pass an empty string as the encoding argument.
If you don't pass anything at all, the default is UTF-8 in 5.4. In 5.3
this was ISO-8859-1.
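
To make the difference between the encoding arguments concrete, here is
a rough sketch. The sample string and variable names are made up for
illustration. Only the explicit-encoding calls have a result that is
independent of ini settings; the empty-string call is shown without an
expected result because what it picks depends on the configuration.

```php
<?php
// "café <b>" encoded as ISO-8859-1: the é is the single byte 0xE9,
// which is not a valid UTF-8 sequence on its own.
$latin1 = "caf\xE9 <b>";

// Explicit encoding: the result does not depend on default_charset
// or on the built-in default of the running PHP version.
$asLatin1 = htmlspecialchars($latin1, ENT_QUOTES, 'ISO-8859-1');
$asUtf8   = htmlspecialchars($latin1, ENT_QUOTES, 'UTF-8');

var_dump($asLatin1 === "caf\xE9 &lt;b&gt;"); // bool(true)
var_dump($asUtf8 === '');                    // bool(true): invalid UTF-8 gives ''

// Empty-string encoding: this is the case where the function goes
// looking at other settings, default_charset among them. The outcome
// depends on the configuration, so no result is asserted here.
ini_set('default_charset', 'ISO-8859-1');
$viaIni = htmlspecialchars($latin1, ENT_QUOTES, '');
```

The second var_dump() is the failure mode legacy ISO-8859-1 apps hit
when the data is treated as UTF-8: the call returns an empty string
rather than the escaped text.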

And here is where the confusion comes in. We, myself included, have told
people that they can get the 5.3 behaviour back by setting the
default_charset ini directive to iso-8859-1. But this is only true if
they are forcing htmlspecialchars/htmlentities to check that setting
with an empty string as the encoding arg. Most apps just do
htmlspecialchars($str) and nothing else. Plus, it is really not a good
idea to tie the internal encoding of data being passed to these
functions to the output charset. You should be able to change the output
charset without worrying about your runtime encoding at that level.

What this effectively means is that we are asking people to go through
their code and add an explicit charset to all htmlspecialchars() and
htmlentities() calls. I think this will be a hurdle for 5.4 adoption.
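
As an illustration of what that code change looks like in practice,
here is a minimal sketch that funnels the explicit charset through one
helper. The names APP_RUNTIME_ENCODING and esc() are hypothetical and
not part of PHP; an app would pick its own.

```php
<?php
// Hypothetical application-wide constant; not a PHP built-in.
define('APP_RUNTIME_ENCODING', 'ISO-8859-1');

// Hypothetical helper: always passes an explicit encoding, so the
// result does not change with the PHP version's default or with
// the default_charset ini directive.
function esc($str, $flags = ENT_QUOTES)
{
    return htmlspecialchars($str, $flags, APP_RUNTIME_ENCODING);
}

echo esc("caf\xE9 & <b>"), "\n"; // caf\xE9 &amp; &lt;b&gt; (0xE9 passes through as-is)
```

Note that this only centralizes the encoding choice. Every existing
htmlspecialchars($str) call still has to be found and changed to use
the helper, which is exactly the hurdle described above.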

What we really need is what we added in PHP 6: a runtime encoding ini
setting that is distinct from the output charset, which we can use here.
That would allow people to fix all their legacy code to a specific
runtime encoding with a single ini setting instead of changing thousands
of lines of code. I propose that we add such a directive to 5.4.1 to
ease migration.

See https://bugs.php.net/61354 for the first signs of grumbling about
this one. As more people migrate I have a feeling this will end up being
the most difficult part of the migration.

-Rasmus

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
