Hi,

Since version 5.2.5, the htmlspecialchars() and htmlentities() functions 
return an empty string when the input contains even a single invalid or 
incomplete Unicode sequence.

As I understand it, this change was made to avoid reading past the end of 
the input buffer.

Should these functions really discard the whole string because of a single 
incomplete sequence?

I made a patch that changes these functions to skip invalid sequences 
instead of discarding the whole string. It involves very few changes and 
makes the behavior of these functions more consistent with previous PHP 
versions, while keeping the fixes that were made in the internal 
get_next_char() function.
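
To illustrate the idea, here is a minimal standalone sketch in C. It is not 
the patch itself and does not use PHP's get_next_char(); the names 
utf8_seq_len() and escape_skip_invalid() are hypothetical, and dropping one 
byte per invalid sequence is an assumption about the skipping policy. It 
only shows that the bounds check can stay in place while the rest of the 
string is preserved.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not PHP's get_next_char()): returns the length (1-4)
 * of a valid UTF-8 sequence starting at s[pos], or 0 if the sequence is
 * invalid or incomplete. It never reads beyond s[len - 1]. */
static size_t utf8_seq_len(const unsigned char *s, size_t len, size_t pos)
{
    unsigned char c = s[pos];
    size_t need, i;

    if (c < 0x80) return 1;                       /* plain ASCII */
    else if (c >= 0xC2 && c <= 0xDF) need = 2;
    else if (c >= 0xE0 && c <= 0xEF) need = 3;
    else if (c >= 0xF0 && c <= 0xF4) need = 4;
    else return 0;            /* stray continuation or invalid lead byte */

    if (len - pos < need) return 0;   /* incomplete sequence at buffer end */
    for (i = 1; i < need; i++)
        if ((s[pos + i] & 0xC0) != 0x80) return 0;

    /* Reject overlong forms, surrogates, and code points above U+10FFFF. */
    if (c == 0xE0 && s[pos + 1] < 0xA0) return 0;
    if (c == 0xED && s[pos + 1] > 0x9F) return 0;
    if (c == 0xF0 && s[pos + 1] < 0x90) return 0;
    if (c == 0xF4 && s[pos + 1] > 0x8F) return 0;
    return need;
}

/* Escapes & < > " while skipping invalid bytes instead of giving up.
 * Assumption: out has room for at least 6 * len + 1 bytes.
 * Returns the number of bytes written, excluding the trailing NUL. */
size_t escape_skip_invalid(const char *in, size_t len, char *out)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t pos = 0, w = 0;

    while (pos < len) {
        size_t n = utf8_seq_len(s, len, pos);
        const char *rep = NULL;

        if (n == 0) {          /* invalid: drop one byte and keep going */
            pos++;
            continue;
        }
        if (n == 1) {
            switch (s[pos]) {
            case '&': rep = "&amp;";  break;
            case '<': rep = "&lt;";   break;
            case '>': rep = "&gt;";   break;
            case '"': rep = "&quot;"; break;
            }
        }
        if (rep) {
            size_t rl = strlen(rep);
            memcpy(out + w, rep, rl);
            w += rl;
        } else {               /* copy the valid sequence unchanged */
            memcpy(out + w, s + pos, n);
            w += n;
        }
        pos += n;
    }
    out[w] = '\0';
    return w;
}
```

With this sketch, an input that ends in a truncated two-byte sequence loses 
only the dangling lead byte, and the remaining text is still escaped and 
returned.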

The patch: http://s3.amazonaws.com/arnaud.lb/php_htmlentities_utf.patch
The bug entry: http://bugs.php.net/bug.php?id=43896

