PottyMouth: http://devsuki.com/pottymouth/

I recommend it.

On Mon, Jun 30, 2008 at 2:08 PM, Ian Bicking <[EMAIL PROTECTED]> wrote:
>
> Jonathan Vanasco wrote:
>> I need to sanitize user input for 'comments' and 'postings'.
>>
>> Can anyone suggest good ways to handle this?
>>
>> Browsing the web and other projects, it seems most people do this:
>> - use beautiful soup ( which i think might be overkill )
>> - use a sanitize function from sam ruby's mombo/post.py ( i'm worried
>> that it's from '03 and a ton of regex )
>> - rely on formatting into bbcode / markdown / textile
>>
>> I'd really like to find something that works like Perl's
>> HTML::StripScripts::Parser ( 
>> http://search.cpan.org/~drtech/HTML-StripScripts-Parser-1.02/Parser.pm
>> )- which will just pull out XSS info and other untrustworthy text.
>>
>> Anyone have a suggestion ?
>
> You need to use a parser that also does XSS cleaning.  feedparser.py has
> one.  lxml.html.clean has one.  People have mentioned one based on
> BeautifulSoup, but I don't know where that is.  I'm unclear about
> textile/markdown, as they allow some HTML, but I guess it's just very
> specific markup.  There's an algorithm documented for HTML 5.  Simply
> parsing isn't sufficient, though it is also required for any decent
> cleaner (pure regexes are a bad idea).
>
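
To make the "parse first, then whitelist" point concrete, here is a rough sketch using only the standard library's html.parser. The tag whitelist, the scheme list, and the names WhitelistCleaner and clean are my own assumptions for illustration; this is not how feedparser or lxml.html.clean work internally. It does not balance unclosed tags, so treat it as a picture of the idea and not as a vetted sanitizer.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical policy for illustration; a real deployment would tune these.
ALLOWED_TAGS = {"a", "b", "i", "em", "strong", "p", "br", "blockquote", "code", "pre"}
VOID_TAGS = {"br"}                      # tags that never get a closing tag
SAFE_SCHEMES = {"http", "https", "mailto"}
DROP_CONTENT = {"script", "style"}      # drop these tags and everything inside them


def safe_href(value):
    """Return the href if it uses an explicitly allowed scheme, else None."""
    value = (value or "").strip()
    try:
        scheme = urlsplit(value).scheme.lower()
    except ValueError:
        return None
    return value if scheme in SAFE_SCHEMES else None


class WhitelistCleaner(HTMLParser):
    """Re-emit only whitelisted tags; escape all text; drop all other markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        attr_text = ""
        if tag == "a":
            # Every attribute except a vetted href is discarded.
            href = safe_href(dict(attrs).get("href"))
            if href is not None:
                attr_text = ' href="%s"' % escape(href, quote=True)
        self.out.append("<%s%s>" % (tag, attr_text))

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def clean(html_text):
    parser = WhitelistCleaner()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

The design choice is that nothing from the input is copied through verbatim: tags are rebuilt from the whitelist and text is re-escaped, which is the part a regex-only approach tends to get wrong. For real use I would still reach for one of the maintained cleaners mentioned above.
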
> lxml.html.clean also has an auto-linker and word wrapping (so long lines
> don't break out of the width... though I think CSS rules can also do
> that).  Some kind of whitespace handling should also be included for
> comments, I think (both vertical and horizontal)... ideally better than
> comment.replace('\n', '<br>\n') (which is just really lame and not even
> very sensible, since newlines in attributes are perfectly fine).
> Python-writing commenters know the pain of vertical whitespace
> insensitivity.  Blogger users know the crappiness of just putting in
> <br>'s willy-nilly.
>
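
For the whitespace part, a minimal sketch of something a bit smarter than replace('\n', '<br>\n'), assuming the comment arrives as plain text (no HTML allowed, which sidesteps the newlines-in-attributes problem). The function name comment_to_html and the rule "any indented line makes the block a <pre>" are my own assumptions, picked so pasted Python keeps its indentation.

```python
import re
from html import escape


def comment_to_html(text):
    """Turn a plain-text comment into paragraphs, keeping indented blocks intact."""
    # Normalize line endings and trim blank lines at the edges.
    text = text.replace("\r\n", "\n").replace("\r", "\n").strip("\n")
    parts = []
    # One or more blank (or whitespace-only) lines separate blocks.
    for block in re.split(r"\n(?:[ \t]*\n)+", text):
        if not block.strip():
            continue
        lines = block.split("\n")
        if any(line[:1] in (" ", "\t") for line in lines):
            # Indentation looks significant (e.g. pasted code): keep it verbatim.
            parts.append("<pre>%s</pre>" % escape(block, quote=False))
        else:
            # Ordinary prose: one <p> per block, <br> only for breaks inside it.
            body = "<br>\n".join(escape(line.rstrip(), quote=False) for line in lines)
            parts.append("<p>%s</p>" % body)
    return "\n".join(parts)
```

Blank lines become paragraph boundaries instead of stacked <br> tags, and everything is escaped before any markup is added, so the output contains only the tags this function itself writes.
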
> --
> Ian Bicking : [EMAIL PROTECTED] : http://blog.ianbicking.org
>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"pylons-discuss" group.
To post to this group, send email to [EMAIL PROTECTED]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/pylons-discuss?hl=en
-~----------~----~----~----~------~----~------~--~---
