Jonathan Vanasco wrote:
> I need to sanitize user input for 'comments' and 'postings'.
> 
> Can anyone suggest good ways to handle this?
> 
> Browsing the web and other projects, it seems most people do this:
> - use beautiful soup ( which i think might be overkill )
> - use a sanitize function from sam ruby's mombo/post.py ( i'm worried
> that it's from '03 and a ton of regex )
> - rely on formatting into bbcode / markdown / textile
> 
> I'd really like to find something that works like Perl's
> HTML::StripScripts::Parser ( 
> http://search.cpan.org/~drtech/HTML-StripScripts-Parser-1.02/Parser.pm
> )- which will just pull out XSS info and other untrustworthy text.
> 
> Anyone have a suggestion ?

You need a parser that also does XSS cleaning.  feedparser.py has 
one, and so does lxml.html.clean.  People have mentioned one based on 
BeautifulSoup, but I don't know where it lives.  I'm unsure about 
textile/markdown: they let some HTML through, though I guess only very 
specific markup.  There's an algorithm documented for HTML 5.  Parsing 
alone isn't sufficient, but any decent cleaner has to start from a real 
parser (pure regexes are a bad idea).
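
To make "parse first, then clean" concrete, here is a minimal sketch 
using only the standard library's html.parser.  It is not the code from 
feedparser or lxml.html.clean; the whitelist, the names 
(CommentSanitizer, safe_url, sanitize) and the choice of allowed URL 
schemes are all assumptions made for illustration.  It also does not 
balance unclosed tags, so treat it as a starting point rather than a 
vetted sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist, chosen for illustration only.
ALLOWED_TAGS = {"a", "b", "blockquote", "code", "em", "i", "p", "pre", "strong"}
ALLOWED_ATTRS = {"a": {"href"}}
ALLOWED_SCHEMES = {"http", "https", "mailto"}
# Elements whose text content is dropped along with the tags themselves.
DROP_CONTENT = {"script", "style"}


def safe_url(url):
    """Return True for relative URLs and for URLs with a whitelisted scheme."""
    # Browsers ignore whitespace and control characters inside a scheme,
    # so remove them before looking for one.
    compact = "".join(ch for ch in url if ord(ch) > 32)
    scheme, sep, _rest = compact.partition(":")
    if not sep or any(ch in scheme for ch in "/?#"):
        return True  # no scheme present: a relative URL
    return scheme.lower() in ALLOWED_SCHEMES


class CommentSanitizer(HTMLParser):
    """Re-emit only whitelisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue  # drops onclick, style, and anything unlisted
            value = value or ""
            if name == "href" and not safe_url(value):
                continue
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(markup):
    parser = CommentSanitizer()
    parser.feed(markup)
    parser.close()  # flushes any text still buffered by the parser
    return "".join(parser.out)
```

For example, sanitize('<p onclick="x()">hi <script>alert(1)</script></p>') 
returns '<p>hi </p>'.  The point of going through the parser is that 
the output is rebuilt from parsed tags and attributes, so anything not 
explicitly whitelisted never reaches the page.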

lxml.html.clean also has an auto-linker and word wrapping (so long lines 
don't break out of the width... though I think CSS rules can also do 
that).  Some kind of whitespace handling should also be included for 
comments, I think (both vertical and horizontal)... ideally better than 
comment.replace('\n', '<br>\n'), which is really lame and not even very 
sensible: newlines are perfectly legal inside attribute values, and a 
blind replace would stick a <br> in the middle of a tag.  Python-writing 
commenters know the pain of whitespace being ignored.  Blogger users 
know the crappiness of <br>'s being put in willy-nilly.
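
As a rough sketch of what slightly better whitespace handling could 
look like, the function below treats a comment as plain text: blank 
lines become paragraph breaks, single newlines become <br>, and leading 
indentation is preserved with &nbsp;.  The name comment_to_html and the 
4-column tab width are assumptions of mine, and the sketch assumes no 
HTML is allowed through at all (everything is escaped), which sidesteps 
the newline-inside-a-tag problem entirely.

```python
import html
import re


def _render_line(line):
    """Escape one line, keeping its leading indentation visible."""
    line = line.expandtabs(4).rstrip()  # assumed tab width of 4
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    return "&nbsp;" * indent + html.escape(stripped, quote=False)


def comment_to_html(text):
    """Turn a plain-text comment into <p> blocks with <br> line breaks."""
    text = text.replace("\r\n", "\n").replace("\r", "\n").strip("\n")
    # One or more blank (or whitespace-only) lines separate paragraphs.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    rendered = []
    for paragraph in paragraphs:
        lines = [_render_line(line) for line in paragraph.split("\n")]
        rendered.append("<p>" + "<br>\n".join(lines) + "</p>")
    return "\n".join(rendered)
```

With this, a comment containing an indented Python snippet keeps its 
indentation, and two paragraphs separated by any number of blank lines 
come out as exactly two <p> elements instead of a pile of <br>'s.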

-- 
Ian Bicking : [EMAIL PROTECTED] : http://blog.ianbicking.org
