I put an HTML sanitizing helper in WebHelpers dev.  It's
webhelpers.html.converters.sanitize(), defined in
webhelpers.html.render.  I'm not sure I'm satisfied with it, though.
It strips all tags but leaves their content.

This would handle:
    I <i>really</i> like <script language="javascript"></script> steak!
=>
    I really like steak!


On the other hand, it lets this through:
    I <i>really</i> like <script language="javascript">NEFARIOUS CODE</script> steak!
=>
    I really like NEFARIOUS CODE steak!
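
For reference, the strip-the-tags-keep-the-content behavior can be sketched
roughly as follows.  This is not the WebHelpers code; it assumes Python 3's
html.parser (the successor to the HTMLParser module), and TagStripper and
strip_tags are hypothetical names used only for illustration.  Entities are
passed through unresolved, as described above.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect text content and discard every tag (illustrative only)."""

    def __init__(self):
        # convert_charrefs=False so entities reach the handlers below
        # instead of being decoded into raw characters.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_data(self, data):
        # Text between tags, including the body of <script> elements.
        self.parts.append(data)

    def handle_entityref(self, name):
        # Re-emit named entities such as &amp; without resolving them.
        self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        # Re-emit numeric references such as &#60; without resolving them.
        self.parts.append("&#%s;" % name)


def strip_tags(text):
    """Return text with all tags removed and their content kept."""
    parser = TagStripper()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    print(strip_tags('I <i>really</i> like <script language="javascript">'
                     'NEFARIOUS CODE</script> steak!'))
```

The script body survives as plain text because the parser hands it to
handle_data like any other text.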


I'm not sure whether that can be exploited.  It also doesn't resolve HTML
entities.  Should it?  The JavaScript may contain entities or raw
<'s meant for comparisons.

The HTML parser (Python's HTMLParser) lets raw <'s surrounded by
whitespace through:
    A < B
=>
    A < B

But it raises an exception if the < looks like an unfinished tag:
    A <B
=>
    HTMLParser.HTMLParseError: EOF in middle of construct, at line 1, column 3

This means we can't make a converter that handles all pathological
input without significant work.

I could strip the tag *and* the content, which would remove the
embedded Javascript but make users wonder where their <i> content
went, potentially leading to unreadable text.
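
One possible middle ground, sketched below, is to drop the content only for
tags whose bodies are never meant to be read, and keep it for everything
else.  This is a variation I'm assuming for illustration, not what sanitize()
does: it uses Python 3's html.parser, and DROP_CONTENT, SelectiveStripper and
selective_strip are hypothetical names.  The choice of script and style as
the tags to drop is also an assumption.

```python
from html.parser import HTMLParser

# Assumption: tags whose content is dropped along with the tag itself.
DROP_CONTENT = {"script", "style"}


class SelectiveStripper(HTMLParser):
    """Strip all tags; keep content except inside DROP_CONTENT tags."""

    def __init__(self):
        # Keep entities unresolved so the output is no less escaped
        # than the input.
        super().__init__(convert_charrefs=False)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        if not self.skip_depth:
            self.parts.append("&#%s;" % name)


def selective_strip(text):
    """Return text without tags and without script/style bodies."""
    parser = SelectiveStripper()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    print(selective_strip('I <i>really</i> like <script language="javascript">'
                          'NEFARIOUS CODE</script> steak!'))
```

The <i> content stays readable while the script body disappears, but an
unclosed <script> would swallow everything after it, so this has the same
"removes more than expected" risk on broken markup.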

PHP's strip_tags just strips the tags but leaves the content, so maybe
that's enough?

http://us2.php.net/manual/en/function.strip-tags.php

The manual page has this caveat:

    Because strip_tags() does not actually validate the HTML, partial,
    or broken tags can result in the removal of more text/data than
    expected.

I've got several patches implemented in WebHelpers tip.  I'll
probably release the beta in a few days, although I would like to give
it a proper manual before the final release.  But I'm still learning how
to set that up with Sphinx.

-- 
Mike Orr <sluggos...@gmail.com>
