Hi Dave,

html5lib looks like a well-maintained, active project.
The Stack Overflow clone whitelists a subset of the default safe
elements (e.g. no button elements), which looks all right to me. Of
course, I'm no expert at this, so don't quote me on that :)
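
For anyone following along, here is a rough sketch of the whitelist idea
using only the Python standard library. It is not the soclone wrapper and
it is not html5lib's sanitizer: the tag and attribute lists and the names
WhitelistSanitizer and sanitize_html are made up for illustration, so
treat it as a toy to show the approach rather than something to deploy.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist for illustration only; not the soclone or html5lib list.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "a", "ul", "ol", "li", "br"}
ALLOWED_ATTRS = {"a": {"href"}}
# Only absolute URLs with these prefixes are kept (relative links are dropped too).
SAFE_SCHEMES = ("http://", "https://", "mailto:")
# Elements whose text content is dropped entirely, not just their tags.
DROP_CONTENT = {"script", "style"}


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags/attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # the tag is dropped; its text content is still kept
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue  # drops onclick, style, and so on
            value = value or ""
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # drops javascript: and other unexpected schemes
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag != "br":
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize_html(text):
    parser = WhitelistSanitizer()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)
```

Calling sanitize_html() on the submitted comment from a form's clean
method, as Dave does below, would then strip event handlers, script
blocks, and any tag outside the list. A real parser-based sanitizer such
as html5lib handles malformed markup far more carefully than this does.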

Thanks for sharing,

Chris


On Jan 21, 4:36 pm, Dave <[email protected]> wrote:
> Thanks Chris and Alexander,
>
> I took a look at both... from the links I also found
> http://code.google.com/p/soclone/source/browse/trunk/soclone/utils/ht...
> which uses html5lib. It puts a wrapper on html5lib and helped me
> figure out how to make it work.
>
> What is wicked cool is that what appeared to be a nightmare seems to
> work just great. For others attempting the same thing, do this:
> 1- get & install html5lib. Note: python manage.py install failed for
> me so I just copied it to my project folder.
> 2- get the code from the link above and save it as a file in your
> project (e.g. htmlsanitize.py)
> 3- I run the code in a clean method in my forms (e.g. def
> clean_comment), such as below:
>
>         def clean_comment(self):
>                 import htmlsanitize
>                 data = htmlsanitize.sanitize_html(self.cleaned_data['comment'])
>                 return data
>
> So far so good for me.
>
> Would love to hear 'thumbs up' or 'thumbs down' if this is a good
> approach.
>
> thx again
>
> Dave
>
> Chris Tan wrote:
> > Check out:
> > http://feedparser.org/docs/html-sanitization.html
>
> > On Jan 21, 2:47 pm, Dave <[email protected]> wrote:
> > > There must be an easy answer for this problem and I almost feel dumb
> > > for asking.... BUT I can't figure it out and have spent too much time
> > > trying. The scenario is a comment/blog situation. I am using tinyMCE
> > > which is creating 'trustable' html. I can display this with django by
> > > using {{field|safe}}... all is good.
>
> > > The problem is some bozo will have their way with the textarea by
> > > turning off their JavaScript. So I'm trying to figure out the best way to
> > > sanitize the data. The normal escaping of data won't work because it
> > > clobbers the 'good' html from tinyMCE. Anyway, it would be good to
> > > sanitize even the tinyMCE-generated html.
>
> > > I've been looking at using html5 lib/parser but can't seem to get it
> > > to work. I've even gone through creating a replace method to escape
> > > everything and then put back the 'good' tags. However, that seems like
> > > a round-about way to go and gets really nasty when considering the img,
> > > span, etc. tags tinyMCE creates so nicely. Surely many have come
> > > across this and there is an easy answer.
>
> > > All suggestions and recommendations are greatly appreciated.
>
> > > thx,
>
> > > Dave
