Hi,

Since version 5.2.5, the htmlspecialchars() and htmlentities() functions return an empty string when the input contains even a single invalid or incomplete Unicode sequence.
As I understand it, this change was made to prevent these functions from reading past the end of the input buffer. But should they really discard the whole string because of a single incomplete sequence?

I have written a patch that makes these functions skip invalid sequences instead of discarding the whole string. It involves very few changes and makes the behavior more consistent with previous PHP versions, while keeping the fixes that were made in the internal get_next_char() function.

The patch: http://s3.amazonaws.com/arnaud.lb/php_htmlentities_utf.patch
The bug entry: http://bugs.php.net/bug.php?id=43896
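
To illustrate the idea of skipping invalid sequences rather than discarding the whole string, here is a minimal standalone sketch in C. It is not the patch itself and does not use PHP's get_next_char(); the names utf8_seq_len() and escape_skip_invalid() are hypothetical, and the UTF-8 check is deliberately simplified. The point is only that the bounds check and the "skip and continue" behavior can coexist.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not PHP's get_next_char()): returns the length (1-4)
 * of the UTF-8 sequence starting at s[0], or 0 if the sequence is invalid or
 * would run past the end of the buffer. It never reads beyond s[len - 1].
 * Simplified: overlong 3/4-byte forms and surrogates are not rejected. */
static size_t utf8_seq_len(const unsigned char *s, size_t len)
{
    size_t need, i;

    if (len == 0) return 0;
    if (s[0] < 0x80) return 1;
    else if (s[0] >= 0xC2 && s[0] <= 0xDF) need = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF) need = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF4) need = 4;
    else return 0;                       /* stray continuation or bad lead byte */

    if (need > len) return 0;            /* incomplete sequence at end of input */
    for (i = 1; i < need; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;
    return need;
}

/* Hypothetical escaper: copies in[0..in_len) to out, escaping & < > " and
 * silently skipping invalid bytes instead of giving up on the whole string.
 * Returns the number of bytes written (excluding the terminating NUL), or
 * (size_t)-1 if out is too small. */
static size_t escape_skip_invalid(const char *in, size_t in_len,
                                  char *out, size_t out_cap)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t i = 0, o = 0;

    if (out_cap == 0) return (size_t)-1;

    while (i < in_len) {
        size_t n = utf8_seq_len(s + i, in_len - i);
        const char *rep = NULL;
        size_t rep_len;

        if (n == 0) {                    /* invalid byte: skip it and go on */
            i++;
            continue;
        }
        if (n == 1) {
            switch (s[i]) {
            case '&': rep = "&amp;";  break;
            case '<': rep = "&lt;";   break;
            case '>': rep = "&gt;";   break;
            case '"': rep = "&quot;"; break;
            }
        }
        if (rep) {
            rep_len = strlen(rep);
        } else {                         /* copy the valid sequence unchanged */
            rep = in + i;
            rep_len = n;
        }
        if (o + rep_len + 1 > out_cap) return (size_t)-1;
        memcpy(out + o, rep, rep_len);
        o += rep_len;
        i += n;
    }
    out[o] = '\0';
    return o;
}
```

With this sketch, an input ending in a truncated sequence loses only the offending bytes: the rest of the string is still escaped and returned.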