Hi all,

Further to yesterday's discussion about the danger of onLoad, onMouseOver,
and similar event-handler attributes on allowed tags when using
strip_tags(), I've decided to look at the issue from another angle.

For the limited set of tags I usually allow in user input, <B><I><U>, I'm
going to take the approach of deleting anything I don't specifically TRUST,
rather than deleting only the things I know to be dangerous.

For such simple tags, it seems to me to be a smarter move to delete
everything inside the tag apart from the tag name itself.

<B"anything else"> becomes <B>

This eliminates the danger of people putting anything evil like
onmouseover="javascript:self.close();" into my small set of allowed tags.


So, I'd like a regexp which looks for every occurrence of a tag (let's
take <B> as an example) and throws out anything not needed.

In English, I guess it looks like:

look for a "<" followed by a "b" (case insensitive), then throw away
anything up to the first ">" we find.
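
As a rough sketch of that description (not a tested, production-ready
filter), a single preg_replace() call might look like the block below. The
function name strip_b_attributes() is made up for illustration. One detail
the plain-English version misses: "<" followed by "b" also matches <br> and
<body>, so the sketch puts a word boundary (\b) after the tag name.

```php
<?php
// Hypothetical helper: reduce <b ...> and </b ...> to the bare tag.
function strip_b_attributes(string $html): string
{
    // <      literal opening bracket
    // (\/?)  optional slash, so closing tags are handled too
    // (b)    the tag name, captured so its original case is kept
    // \b     word boundary: stops "b" from matching "br" or "body"
    // [^>]*  everything up to the first ">", which gets thrown away
    return preg_replace('/<(\/?)(b)\b[^>]*>/i', '<$1$2>', $html);
}

echo strip_b_attributes('<B onmouseover="javascript:self.close();">bold</B><br>');
// prints: <B>bold</B><br>
```

Because the match stops at the first ">", an attribute value that itself
contains ">" leaves its tail behind as ordinary text rather than as part of
a tag.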

Better still would be a regexp or function that checks for b|i|u, or a
passed set of tags.
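
One possible shape for such a function is sketched below. The name
strip_tag_attributes() and its default tag list are assumptions for
illustration, not an existing PHP function. It builds an alternation
(b|i|u) from the array passed in and applies the same "throw away
everything up to the first >" rule to each listed tag.

```php
<?php
// Hypothetical helper: strip everything but the tag name from the given tags.
function strip_tag_attributes(string $html, array $tags = ['b', 'i', 'u']): string
{
    // With no tags listed there is nothing to clean, and an empty
    // alternation would match every tag, so return the input unchanged.
    if (count($tags) === 0) {
        return $html;
    }

    // Escape each name so it is treated literally inside the pattern.
    $names = implode('|', array_map(function ($tag) {
        return preg_quote($tag, '/');
    }, $tags));

    // Same pattern as for a single tag, with the name replaced by (b|i|u).
    $pattern = '/<(\/?)(' . $names . ')\b[^>]*>/i';

    return preg_replace($pattern, '<$1$2>', $html);
}

echo strip_tag_attributes('<I onload="x()">hi</I> <u class=z>u</u><br>');
// prints: <I>hi</I> <u>u</u><br>
```

This only cleans the tags it is told about and leaves every other tag
alone, so it would presumably run after strip_tags() has removed the
disallowed ones, for example
strip_tag_attributes(strip_tags($input, '<b><i><u>')), where $input stands
for the user-supplied string.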


I'm aware that this type of hard-line approach will prevent <B id="foo">,
and that I will also have problems with things like <FONT face="something">
and <A HREF="foo.php">, but I plan to devise some pseudo tags for links, and
I don't require font tags, image tags, etc.


Many thanks in advance,

Justin French


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
