PottyMouth: http://devsuki.com/pottymouth/
I recommend it.

On Mon, Jun 30, 2008 at 2:08 PM, Ian Bicking <[EMAIL PROTECTED]> wrote:
>
> Jonathan Vanasco wrote:
>> I need to sanitize user input for 'comments' and 'postings'.
>>
>> Can anyone suggest good ways to handle this?
>>
>> Browsing the web and other projects, it seems most people do this:
>> - use beautiful soup ( which i think might be overkill )
>> - use a sanitize function from sam ruby's mombo/post.py ( i'm worried
>> that its from '03 and a ton of regex )
>> - rely on formatting into bbcode / markdown / textile
>>
>> I'd really like to find something that works like Perl's
>> HTML::StripScripts::Parser (
>> http://search.cpan.org/~drtech/HTML-StripScripts-Parser-1.02/Parser.pm
>> ) - which will just pull out XSS info and other untrustworthy text.
>>
>> Anyone have a suggestion ?
>
> You need to use a parser that also does XSS cleaning. feedparser.py has
> one. lxml.html.clean has one. People have mentioned one based on
> BeautifulSoup, but I don't know where that is. I'm unclear about
> textile/markdown, as they allow some HTML, but I guess it's just very
> specific markup. There's an algorithm documented for HTML 5. Simply
> parsing isn't sufficient, though it is also required for any decent
> cleaner (pure regexes are a bad idea).
>
> lxml.html.clean also has an auto-linker and word wrapping (so long lines
> don't break out of the width... though I think CSS rules can also do
> that). Some kind of whitespace handling should also be included for
> comments, I think (both vertical and horizontal)... ideally better than
> comment.replace('\n', '<br>\n') (which is just really lame and not even
> very sensible, since newlines in attributes are perfectly fine).
> Python-writing commenters know the pain of vertical whitespace
> insensitivity. Blogger users know the crappiness of just putting in
> <br>'s willy-nilly.
>
> --
> Ian Bicking : [EMAIL PROTECTED] : http://blog.ianbicking.org
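
For the whitespace handling Ian mentions, here is a minimal sketch using only the Python standard library. It assumes comments are plain text, so it escapes all markup instead of cleaning it; it is not a replacement for a parser-based cleaner such as lxml.html.clean. The function names (comment_to_html, _paragraph_to_html, _line_to_html) are made up for illustration and do not come from any of the libraries named in this thread.

```python
import html
import re


def _line_to_html(line):
    """Escape one line, keeping leading spaces visible as &nbsp; entities."""
    line = line.rstrip()
    stripped = line.lstrip(" ")  # only spaces are treated as indentation here
    indent = len(line) - len(stripped)
    return "&nbsp;" * indent + html.escape(stripped)


def _paragraph_to_html(paragraph):
    """Wrap one paragraph in <p>, turning single newlines into <br>."""
    lines = [l for l in paragraph.split("\n") if l.strip()]
    return "<p>" + "<br>\n".join(_line_to_html(l) for l in lines) + "</p>"


def comment_to_html(comment):
    """Convert a plain-text comment to HTML (assumption: no markup allowed)."""
    # Normalize line endings so the splitting below only deals with "\n".
    text = comment.replace("\r\n", "\n").replace("\r", "\n")
    # Any run of blank lines separates paragraphs.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return "\n".join(_paragraph_to_html(p) for p in paragraphs)


if __name__ == "__main__":
    print(comment_to_html("if a < b:\n    pass\n\n\nbye"))
```

Because the text is escaped before any tags are added, the <br> and <p> tags only ever land between pieces of escaped text, never inside an attribute, which is the problem with a blind comment.replace('\n', '<br>\n') on real HTML.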
