In article <[EMAIL PROTECTED]>,
Matimus  <[EMAIL PROTECTED]> wrote:
>On Jun 4, 6:31 am, "js " <[EMAIL PROTECTED]> wrote:
>> Hi list.
>>
>> If I'm not mistaken, in python, there's no standard library to convert
>> html entities, like &amp; or &gt; into their applicable characters.
>>
>> htmlentitydefs provides maps that help with this conversion,
>> but it's not a function so you have to write your own function
>> to make use of htmlentitydefs, probably using regex or something.
>>
>> To me this seemed odd because python is known as
>> a 'Batteries Included' language.
>>
>> So my questions are
>> 1. Why doesn't python have/need entity encoding/decoding?
>> 2. Is there any idiom to do entity encode/decode in python?
>>
>> Thank you in advance.
>
>I think this is the standard idiom:
>
>>>> import xml.sax.saxutils as saxutils
>>>> saxutils.escape("&")
>'&amp;'
>>>> saxutils.unescape("&gt;")
>'>'
>>>> saxutils.unescape("A bunch of text with entities: &amp; &gt; &lt;")
>'A bunch of text with entities: & > <'
>
>Notice there is an optional parameter (a dict) that can be used to
>define additional entities as well.
			.
			.
			.
Good points; I like your mention of the optional entity dictionary.
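To make that dictionary concrete, here is a minimal sketch. It assumes
Python 3, where html.unescape is available (the thread itself predates
it); the sample string and the extra_entities name are made up for
illustration and are not from the posts above.

```python
from xml.sax.saxutils import escape, unescape
import html

# Illustrative sample: one HTML-only entity plus the three XML ones.
text = "caf&eacute; &amp; &lt;b&gt;"

# With no dictionary, saxutils converts only &amp;, &lt; and &gt;,
# so &eacute; passes through untouched.
xml_only = unescape(text)

# Hypothetical extra mapping passed as the optional second argument.
extra_entities = {"&eacute;": "\u00e9"}
with_dict = unescape(text, extra_entities)

# Python 3's html.unescape knows the full set of HTML named entities.
html_all = html.unescape(text)

# escape() accepts a dictionary too, mapping characters to entities.
quoted = escape('say "hi" & go', {'"': "&quot;"})

print(xml_only)
print(with_dict)
print(html_all)
print(quoted)
```

Without the dictionary the &eacute; survives as-is, so anything beyond
the three XML entities has to be supplied by the caller.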
It's possible that your solution addresses a different problem than
the one the original poster intended.
<URL: http://wiki.python.org/moin/EscapingHtml > has details about
HTML entities vs. XML entities.
-- 
http://mail.python.org/mailman/listinfo/python-list