Edit report at https://bugs.php.net/bug.php?id=47494&edit=1
ID: 47494
Comment by: user at dudmail dot com
Reported by: philipp dot feigl at gmail dot com
Summary: htmlspecialchars does not throw E_WARNING on multibyte problems
Status: Not a bug
Type: Feature/Change Request
Package: Strings related
Operating System: CentOS5
PHP Version: 5.2.8
Block user comment: N
Private report: N
New Comment:
So the warning is not shown with display_errors = 1, to avoid leaks on badly
configured servers, but it is shown, and thus leaks sensitive information, on
properly configured servers? This is lame.
Previous Comments:
------------------------------------------------------------------------
[2012-09-13 18:53:41] lzsiga at freemail dot c3 dot hu
That would be a valid reason if there were any plan to support UTF-16/32, since
ISO-8859-x and UTF-8 are ASCII-compatible. But even then, the default value of
the $encoding parameter could still be 'ascii (or compatible)'.
Or, as with some other string operations, there could be an mb_htmlspecialchars
function.
------------------------------------------------------------------------
[2012-09-13 17:25:08] [email protected]
By simple I assume you mean an htmlspecialchars() function that doesn't check
the validity of the characters. The problem is that we have to do that. We
can't encode characters without understanding which charset we are dealing with
and we need to make sure that the character we are looking at is a valid one.
The world has moved beyond 7-bit ASCII, sorry.
------------------------------------------------------------------------
[2012-09-13 17:07:47] lzsiga at freemail dot c3 dot hu
If the name of the function were
'check_for_multibyte_validity_and_htmlspecialchars' then you'd be right, but
even then I'd lobby for a simple 'htmlspecialchars' function... Doing something
(i.e. a multibyte validity check) that the user (the PHP programmer in this
case) didn't specifically ask for doesn't seem to me to be a good idea (see
magic_quotes for another example).
PS: Of course I wouldn't be complaining (or even know about the whole question)
if the default value hadn't been changed to 'UTF-8' in 5.4.
------------------------------------------------------------------------
[2012-09-06 15:33:13] [email protected]
Also note that many, if not most, apps use this as their only validity filter
and if you output invalid UTF-8, for example, it can lead to security problems
like the well-known IE 0xE0 XSS exploit. So at some point along the line you
have to do a multi-byte check and it may as well be here since we need to do it
anyway.
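
A minimal sketch of that check in action, assuming PHP 5.4 or later (where
ENT_SUBSTITUTE exists). The input string is a hypothetical example, not taken
from this report: a lone 0xE0 lead byte followed by markup, which is invalid
UTF-8.

```php
<?php
// Hypothetical input: a stray 0xE0 lead byte followed by markup (invalid UTF-8).
$input = "\xE0<script>alert(1)</script>";

// Without ENT_IGNORE or ENT_SUBSTITUTE, an invalid sequence makes
// htmlspecialchars() return an empty string.
$strict = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// ENT_SUBSTITUTE replaces the invalid bytes instead of dropping the whole
// string, and still escapes the markup.
$lenient = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict === '');                    // whole string rejected
var_dump(strpos($lenient, '<') === false);   // no raw '<' left in the output
var_dump(preg_match('//u', $lenient) === 1); // output is well-formed UTF-8
```

Either way, the stray lead byte never reaches the browser unmodified.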
------------------------------------------------------------------------
[2012-09-06 15:29:07] [email protected]
You assume ASCII7 compatibility for all encodings, which is a bad assumption.
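
A small sketch of why that assumption fails, using UTF-16LE as the example of
an encoding that is not ASCII-compatible. The byte-wise escaper below is a
hypothetical stand-in for a "simple" htmlspecialchars, not PHP's actual
implementation, and the input is hand-encoded so no extension is needed.

```php
<?php
// "a<b" hand-encoded as UTF-16LE: every character occupies two bytes.
$utf16 = "a\x00<\x00b\x00";

// Hypothetical naive escaper: treats the input as ASCII-compatible bytes.
$escaped = str_replace('<', '&lt;', $utf16);

// The single 0x3C byte became four bytes, so the result is no longer a
// whole number of 16-bit code units and cannot be valid UTF-16.
var_dump(strlen($utf16));
var_dump(strlen($escaped));
var_dump(strlen($escaped) % 2 === 1);
```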
------------------------------------------------------------------------
The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
https://bugs.php.net/bug.php?id=47494