> > Should these functions really discard the whole string for a single
> > incomplete sequence?
>
> I think since it is not possible to recover the true content of the string,
> it is ok to return a failure value. Cutting it in random places or
> ignoring problems doesn't seem a good idea - it might lead to all kinds
> of nasty things, such as security filtering checking one piece of data
> and the database getting entirely different data.
I don't think so. htmlspecialchars' job is to replace characters which the browser may interpret as HTML special characters. Its job is not to validate a string or to check whether it will be passed correctly to a DB. With my patch, htmlspecialchars does just that.

There are many ways for an invalid Unicode sequence to end up in user input. In normal situations, text typed into a form element will be sent in the correct encoding by the browser, but what about file uploads? What if the browser itself sends invalid sequences (e.g. copy/paste of Word documents into a form and/or WYSIWYG-enabled elements using IE)?

Bugs 43896, 43294 and 43549 also report these problems. This new htmlspecialchars version will be a nightmare for many PHP users if it is left as is.
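
To make the disagreement concrete, here is a minimal sketch of the two behaviours. It is not the patch discussed in this thread: escape_html_lenient() is a hypothetical helper written only for illustration, and the input string is made up. It relies on the fact that in UTF-8 the ASCII bytes for &, <, >, " and ' never occur inside a multibyte sequence, so they can be replaced byte-wise without decoding the string.

```php
<?php
// Hypothetical helper for illustration only (not the patch from this thread).
// Escapes the HTML special characters and passes every other byte through
// untouched, including bytes that do not form valid UTF-8.
function escape_html_lenient($s)
{
    // strtr() with an array scans the input once, so the '&' inside an
    // already-inserted entity is never escaped a second time.
    return strtr($s, array(
        '&'  => '&amp;',
        '<'  => '&lt;',
        '>'  => '&gt;',
        '"'  => '&quot;',
        "'"  => '&#039;',
    ));
}

// Made-up input: 0xC3 is a UTF-8 lead byte, but the byte that follows it
// here is '<', so the sequence is incomplete.
$input = "<b>caf\xC3</b>";

// Strict behaviour: with flags that do not ask for substitution, an invalid
// sequence makes htmlspecialchars() return an empty string.
$strict = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
var_dump($strict === '');

// Lenient behaviour: markup is neutralised, the stray byte is kept as is.
$lenient = escape_html_lenient($input);
var_dump($lenient === "&lt;b&gt;caf\xC3&lt;/b&gt;");
```

The lenient variant keeps the rest of the user's text, but the invalid byte is still passed on to whatever consumes the output, which is exactly the concern raised in the quoted reply.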