> > Should these functions really discard the whole string for a single 
> > incomplete sequence?
> 
> I think since it is not possible to recover the true content of the 
> string, it is ok to return a failure value. Cutting it in random places 
> or ignoring problems doesn't seem like a good idea - it might lead to all 
> kinds of nasty things, such as security filtering checking one set of 
> data and the database getting entirely different data.

I don't think so. htmlspecialchars' job is to replace character sequences 
which may be interpreted as HTML special characters by the browser. Its job 
is not to validate a string or to check whether it will be passed correctly 
to a DB.

With my patch, htmlspecialchars does exactly that.
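
To illustrate the behaviour argued for here, below is a minimal sketch in 
Python. It is not the patch itself (which is not shown in this message) and 
not PHP's implementation; the names escape_html_bytes and HTML_ESCAPES are 
hypothetical. It assumes UTF-8 input, where bytes below 0x80 never occur 
inside a multibyte sequence, so the special characters can be replaced byte 
by byte while every other byte, valid or not, is passed through untouched.

```python
# Hypothetical sketch, not PHP's implementation: escape the HTML special
# characters and copy every other byte through unchanged, without
# validating the encoding. Assumes UTF-8 (or plain ASCII) input.
HTML_ESCAPES = {
    ord("&"): b"&amp;",
    ord("<"): b"&lt;",
    ord(">"): b"&gt;",
    ord('"'): b"&quot;",
}


def escape_html_bytes(data: bytes) -> bytes:
    """Return data with &, <, > and " replaced by HTML entities."""
    out = bytearray()
    for byte in data:
        # Special characters get their entity; anything else, including
        # bytes from an incomplete or invalid sequence, is kept as is.
        out += HTML_ESCAPES.get(byte, bytes([byte]))
    return bytes(out)
```

With such a function, a string containing a truncated sequence (for example 
the lone lead byte 0xC3 followed by "<b") still comes back with "<" 
escaped, instead of being discarded as a whole.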

There are many ways for an invalid Unicode sequence to end up in user input. 
In normal situations, text typed in a form element will be sent in the 
correct encoding by the browser, but what about file uploads? What if the 
browser itself sends invalid sequences? (e.g. copy/paste of Word documents 
into a form and/or wysiwyg-enabled elements using IE). Bugs 43896, 43294 and 
43549 also report these problems.

This new htmlspecialchars version will be a nightmare for many PHP users if 
it is left as is.

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
