Edit report at https://bugs.php.net/bug.php?id=47494&edit=1
ID: 47494
Comment by: lzsiga at freemail dot c3 dot hu
Reported by: philipp dot feigl at gmail dot com
Summary: htmlspecialchars does not throw E_WARNING on multibyte problems
Status: Not a bug
Type: Feature/Change Request
Package: Strings related
Operating System: CentOS5
PHP Version: 5.2.8
Block user comment: N
Private report: N
New Comment:
IMHO htmlspecialchars should not check for multi-byte validity at all, because
it only deals with a few characters that are all in 7-bit ASCII, so it could
safely ignore every byte between 0x80 and 0xFF. The third parameter could
simply be ignored (as if it were 'ISO-8859-1').
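A minimal sketch of what that would look like from userland, assuming the input is UTF-8 or a single-byte charset (in UTF-8 every byte of a multi-byte sequence is 0x80 or above, so none can be mistaken for an HTML special character; other multi-byte encodings would need checking). The helper name escape_ascii_only() is made up for illustration and is not part of PHP:

```php
<?php
// Hypothetical helper (not part of PHP): escape only the ASCII characters
// that matter in HTML and pass every byte from 0x80 to 0xFF through as-is.
function escape_ascii_only(string $s): string
{
    // ISO-8859-1 is a single-byte charset, so there is no such thing as an
    // invalid byte sequence and the call does not fail on "bad" input.
    return htmlspecialchars($s, ENT_QUOTES, 'ISO-8859-1');
}

$broken = "caf\xE9 <b>";   // a lone \xE9 is not valid UTF-8
echo bin2hex(escape_ascii_only($broken)), "\n";   // \xE9 is still there, untouched
```

The trade-off is that malformed bytes are passed through to the output unchanged rather than being reported or replaced.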
Previous Comments:
------------------------------------------------------------------------
[2012-08-30 19:21:49] [email protected]
@the disappointed user: PHP 5.4 no longer throws said warning (it was just
confusing). Instead there are several new options for dealing with incorrect
encoding. Of particular interest is ENT_SUBSTITUTE, which will replace invalid
code unit sequences with the Unicode Replacement Character (instead of
returning a rather unhelpful empty string). This way you can easily spot where
the string is incorrectly encoded. Furthermore this option has the additional
advantage of being more graceful (it replaces only the individual incorrectly
encoded bytes, not the whole string).
Hope this helps you. More info in the docs: http://de2.php.net/htmlspecialchars
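For illustration, a short sketch of the difference on PHP 5.4 or later; the sample string is invented, and the flags are passed explicitly so the result does not depend on the default flags of the PHP version in use:

```php
<?php
$input = "a\xFFb <i>";   // \xFF can never appear in valid UTF-8

// Neither ENT_SUBSTITUTE nor ENT_IGNORE: the whole result is an empty string.
$strict = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// ENT_SUBSTITUTE: the invalid byte becomes U+FFFD (EF BF BD in UTF-8)
// and the rest of the string is escaped as usual.
$substituted = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict === '');                                // bool(true)
var_dump($substituted === "a\xEF\xBF\xBDb &lt;i&gt;");   // bool(true)
```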
------------------------------------------------------------------------
[2012-08-30 19:01:22] another_disappointed_php_programmer at exam
This is very sad.
This is a bug, and it's sad that PHP core developers said that it's a feature
and it won't be fixed. I'm disappointed.
------------------------------------------------------------------------
[2012-07-01 15:34:03] [email protected]
This really isn't a bug. I do agree that the approach isn't ideal, but we
shouldn't throw warnings on bad input here because htmlspecialchars() is
explicitly designed to clean up bad input and it is run directly on user data
most of the time. In order for someone to avoid this warning they would need to
first call something like iconv('utf-8','utf-8') to clean up the input data and
that doesn't make much sense since htmlspecialchars() essentially does that
already. But, in order to help debugging there should be some way to see why an
htmlspecialchars() call failed so a last_error() function similar to how it is
handled for json decoding would make sense.
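Until something like that exists, one possible userland workaround is to validate the string first and report the reason yourself. This is only a sketch: escape_or_explain() is a hypothetical name, and it assumes the input is supposed to be UTF-8. It relies on preg_match() with the /u modifier returning false for malformed UTF-8, with preg_last_error() giving the PCRE error code:

```php
<?php
// Hypothetical helper: escape for HTML, but say why the call cannot succeed
// instead of silently handing back an empty string.
function escape_or_explain(string $s): string
{
    // An empty pattern with /u makes PCRE validate the subject as UTF-8;
    // preg_match() returns false when the subject is malformed.
    if (preg_match('//u', $s) === false) {
        throw new InvalidArgumentException(
            'input is not valid UTF-8 (preg_last_error() = ' . preg_last_error() . ')'
        );
    }
    return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
}
```

Callers can catch the exception and log the offending input, which is roughly the debugging aid a last_error() function would provide.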
------------------------------------------------------------------------
[2012-07-01 15:12:31] chris at cbsinteractive dot com
Happening on our production servers; can replicate. PHP 5.3.10, CentOS 5.6.
------------------------------------------------------------------------
[2011-09-27 22:43:02] rudd-o at rudd-o dot com
Reported to /r/lolphp here:
http://www.reddit.com/r/lolphp/comments/kso6p/if_error_reporting_is_on_htmlspecialchars_will/
Do you guys realize there is an ENTIRE COMMUNITY of people devoted to the
collective practice of FACEPALMING at PHP's fails?
Hahaha.
------------------------------------------------------------------------
The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
https://bugs.php.net/bug.php?id=47494