New submission from Brian Jones <bkjo...@gmail.com>: In Python 3.2b2, html.entities.codepoint2name and name2codepoint only support the 252 HTML entity names defined in the HTML 4 spec from 1997. I'm wondering if there's a reason not to support W3C Recommendation 'XML Entity Definitions for Characters'
http://www.w3.org/TR/xml-entity-names/ This standard contains significantly more characters, and it is noted in that spec that the HTML 5 drafts use that spec's entities. You can see the current HTML 5 'Named character references' here: http://www.w3.org/TR/html5/named-character-references.html#named-character-references If this is just a matter of somebody going in to do the grunt work, let me know. If startup costs associated with importing a huge dictionary are a concern, perhaps a more efficient type that enables the same lookup interface can be defined. If other reasons exist to not move in this direction, please do let me know! ---------- components: Library (Lib), Unicode, XML messages: 127865 nosy: Brian.Jones priority: normal severity: normal status: open title: html.entities mapping dicts need updating? type: feature request versions: Python 3.2 _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue11113> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com