Jonathan Vanasco wrote:
> I need to sanitize user input for 'comments' and 'postings'.
>
> Can anyone suggest good ways to handle this?
>
> Browsing the web and other projects, it seems most people do this:
> - use beautiful soup ( which i think might be overkill )
> - use a sanitize function from sam ruby's mombo/post.py ( i'm worried
> that its from '03 and a ton of regex )
> - rely on formatting into bbcode / markdown / textile
>
> I'd really like to find something that works like Perl's
> HTML::StripScripts::Parser (
> http://search.cpan.org/~drtech/HTML-StripScripts-Parser-1.02/Parser.pm
> )- which will just pull out XSS info and other untrustworthy text.
>
> Anyone have a suggestion ?

You need to use a parser that also does XSS cleaning. feedparser.py has
one, and so does lxml.html.clean. People have mentioned one based on
BeautifulSoup, but I don't know where that is. I'm unclear about
textile/markdown: they allow some HTML through, but I guess it's just
very specific markup. There's an algorithm documented for HTML 5. Simply
parsing isn't sufficient on its own, though a real parser is required
for any decent cleaner (pure regexes are a bad idea).
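
To make "parse, then clean" concrete, here is a minimal sketch using only
the standard library's html.parser. It is not what feedparser or
lxml.html.clean do internally, and it is not a vetted sanitizer. The
whitelist, the scheme list, and the names WhitelistCleaner and clean_html
are all assumptions made up for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist, for illustration only; adjust it for your own site.
ALLOWED_TAGS = {"a", "b", "blockquote", "code", "em", "i", "p", "pre", "strong"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")
DROP_WITH_CONTENT = {"script", "style"}


class WhitelistCleaner(HTMLParser):
    """Re-emit only whitelisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []   # whitelisted tags we have emitted and not closed
        self.skipping = 0     # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skipping += 1
            return
        if self.skipping or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag itself, keep its text
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue  # drops onclick, style, and friends
            value = (value or "").strip()
            # Only absolute URLs with a known-safe scheme survive.
            if name == "href" and not value.lower().startswith(SAFE_SCHEMES):
                continue
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skipping = max(0, self.skipping - 1)
            return
        if self.skipping or tag not in self.open_tags:
            return
        # Close anything left open inside this tag so the output stays balanced.
        while self.open_tags:
            closed = self.open_tags.pop()
            self.out.append("</%s>" % closed)
            if closed == tag:
                break

    def handle_data(self, data):
        if not self.skipping:
            self.out.append(escape(data))

    def result(self):
        # Close tags the commenter never closed.
        tail = "".join("</%s>" % t for t in reversed(self.open_tags))
        return "".join(self.out) + tail


def clean_html(fragment):
    cleaner = WhitelistCleaner()
    cleaner.feed(fragment)
    cleaner.close()
    return cleaner.result()
```

Calling clean_html() on a comment body returns markup containing only
the whitelisted tags, with event-handler attributes, script/style blocks,
and non-whitelisted link schemes dropped. Relative links are dropped too,
which is a deliberate simplification here. For real use I'd still reach
for one of the maintained cleaners mentioned above.
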

lxml.html.clean also has an auto-linker and word wrapping (so long lines
don't break out of the width... though I think CSS rules can also do
that). Some kind of whitespace handling should also be included for
comments, I think (both vertical and horizontal)... ideally better than
comment.replace('\n', '<br>\n') (which is just really lame and not even
very sensible, since newlines in attributes are perfectly fine).

Python-writing commenters know the pain of vertical whitespace
insensitivity. Blogger users know the crappiness of just putting in
<br>'s willy-nilly.
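
As a rough idea of what slightly smarter whitespace handling could look
like, here is a sketch that assumes the comment is plain text (no HTML
allowed, so everything gets escaped). Blank lines become paragraph
breaks, single newlines become <br>, and leading indentation is kept
with &nbsp;. The function name text_to_html and the 4-column tab width
are my own assumptions, not something from any of the libraries above.

```python
import re
from html import escape


def _format_line(line):
    """Escape one line, keeping its leading indentation visible."""
    line = line.expandtabs(4).rstrip()
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    return "&nbsp;" * indent + escape(stripped)


def text_to_html(comment):
    """Turn a plain-text comment into <p> blocks with <br> line breaks."""
    # Normalize line endings, then trim blank lines at both ends.
    text = comment.replace("\r\n", "\n").replace("\r", "\n").strip("\n")
    blocks = []
    # One or more blank lines separate paragraphs.
    for para in re.split(r"\n\s*\n", text):
        if not para.strip():
            continue
        lines = [_format_line(line) for line in para.split("\n")]
        blocks.append("<p>%s</p>" % "<br>\n".join(lines))
    return "\n".join(blocks)
```

Because the input is treated as text rather than markup, there are no
tags or attributes for a stray <br> to land inside. If comments may
contain HTML, the same idea would have to be applied to the text nodes
of a parsed tree instead.
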
--
Ian Bicking : [EMAIL PROTECTED] : http://blog.ianbicking.org