Hi Dave,

html5lib looks like a well-maintained and active project. The Stack Overflow clone whitelists a subset of the default safe elements (e.g. no button elements), which looks all right to me. Of course, I'm no expert at this, so don't quote me on that :)
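
For anyone who wants to see the whitelist idea in isolation, here is a rough sketch using only Python's standard-library html.parser. This is not html5lib and not the soclone wrapper: the tag and attribute lists, the class name, and the URL check below are all assumptions made up for illustration. For real user input, a maintained sanitizer such as html5lib is the safer choice.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist for illustration only; NOT the list used by
# html5lib or by the soclone wrapper.
ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li",
                "a", "span", "img"}
ALLOWED_ATTRS = {"a": {"href", "title"}, "img": {"src", "alt"}}
VOID_TAGS = {"br", "img"}  # tags that never get a closing tag
SAFE_URL_PREFIXES = ("http://", "https://", "/")


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags/attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is still kept
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue  # drops onclick, style, etc.
            value = value or ""
            if name in ("href", "src") and not value.strip().lower().startswith(
                    SAFE_URL_PREFIXES):
                continue  # drops javascript: and other unexpected schemes
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize_html(text):
    """Return text with non-whitelisted markup removed."""
    parser = WhitelistSanitizer()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)
```

A function like this could be called from a form's clean method in the same way as the htmlsanitize.sanitize_html call quoted below. Note that the sketch does not rebalance unclosed tags, which is one of the things a real HTML parser like html5lib handles for you.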

Thanks for sharing,
Chris

On Jan 21, 4:36 pm, Dave <[email protected]> wrote:
> Thanks Chris and Alexander,
>
> I took a look at both... from the links I also found
> http://code.google.com/p/soclone/source/browse/trunk/soclone/utils/ht...
> which uses html5lib. It puts a wrapper on html5lib and helped me
> figure out how to make it work.
>
> What is wicked cool is that what appeared to be a nightmare seems to
> work just great. For others attempting the same thing, do this:
> 1- Get & install html5lib. Note: python manage.py install failed for
> me so I just copied it to my project folder.
> 2- Get the code from the link above and save it to a file in your
> project (e.g. htmlsanitize.py)
> 3- I run the code as a clean method in my forms (e.g. def clean_comment)
> such as below:
>
> def clean_comment(self):
>     import htmlsanitize
>     data = htmlsanitize.sanitize_html(self.cleaned_data['comment'])
>     return data
>
> So far so good for me.
>
> Would love to hear 'thumbs up' or 'thumbs down' if this is a good
> approach.
>
> thx again
>
> Dave
>
> Chris Tan wrote:
> > Check out:
> > http://feedparser.org/docs/html-sanitization.html
>
> > On Jan 21, 2:47 pm, Dave <[email protected]> wrote:
> > > There must be an easy answer for this problem and I almost feel dumb
> > > for asking.... BUT I can't figure it out and have spent too much time
> > > trying. The scenario is a comment/blog situation. I am using tinyMCE,
> > > which is creating 'trustable' html. I can display this with django by
> > > using {{field|safe}}... all is good.
>
> > > The problem is some bozo will have their way with the textarea by
> > > turning off their javascript. So I'm trying to figure out the best way
> > > to sanitize the data. The normal escaping of data won't work because it
> > > clobbers the 'good' html from tinyMCE. Anyway, it would be good to
> > > sanitize even the tinyMCE-generated html.
>
> > > I've been looking at using html5 lib/parser but can't seem to get it
> > > to work. I've even gone through creating a replace method to escape
> > > everything and then put back the 'good' tags. However, that seems like
> > > a round-about way to go and gets really nasty when considering the img,
> > > span, etc. tags tinyMCE creates so nicely. Surely many have come
> > > across this and there is an easy answer.
>
> > > All suggestions and recommendations are greatly appreciated.
>
> > > thx,
>
> > > Dave

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en
-~----------~----~----~----~------~----~------~--~---
