Maybe everyone knows about this, but I didn't, and searching the
web I found many sophisticated and weird methods for handling HTML
entities.

But the standard library already has a module, xml.sax.saxutils, which
together with urllib.quote*() and urllib.unquote*() covers the usual
conversions: saxutils escapes and unescapes HTML/XML entities, and
urllib handles URL (percent) encoding and decoding.

I wrote some examples; maybe someone will find them useful.

#!/usr/bin/python
from xml.sax import saxutils
import urllib

# ===== QUOTE Examples =====
print "urllib.quote_plus('<p> hello & world </p>'):"
quotePlusOutput = urllib.quote_plus('<p> hello & world </p>')
print quotePlusOutput,'\n'

print "urllib.unquote_plus('%s'):" % quotePlusOutput
print urllib.unquote_plus(quotePlusOutput),'\n'

# ===== Entities Examples =====
# escape() always converts &, < and >; the dict adds extra replacements
entities = {' ':'&nbsp;'}

print "saxutils.escape('<p>  hello & world  </p>', entities):"
saxEscapeOutput = saxutils.escape('<p>  hello & world  </p>', entities)
print saxEscapeOutput,'\n'

entities={'&nbsp;':' '}
print "saxutils.unescape('%s', entities):" % saxEscapeOutput
print saxutils.unescape( saxEscapeOutput, entities)
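
For anyone on Python 3, here is a rough sketch of the same two round
trips. It assumes the quoting functions are imported from urllib.parse,
which is where they live in Python 3; saxutils is used the same way as
above. The variable names (text, quoted, escaped, restored) are only
for illustration.

```python
from urllib.parse import quote_plus, unquote_plus
from xml.sax import saxutils

text = '<p>  hello & world  </p>'

# URL (percent) encoding: spaces become '+', reserved characters become %XX
quoted = quote_plus(text)
print(quoted)
print(unquote_plus(quoted))

# Entity escaping: &, < and > are always converted; the extra dict
# additionally maps each space to &nbsp;
escaped = saxutils.escape(text, {' ': '&nbsp;'})
print(escaped)

# Reverse mapping turns &nbsp; back into a plain space
restored = saxutils.unescape(escaped, {'&nbsp;': ' '})
print(restored)
```

Both round trips should give back the original string, so comparing
unquote_plus(quoted) and restored against text is a quick sanity check.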
