New submission from Brian Jones <bkjo...@gmail.com>:

In Python 3.2b2, html.entities.codepoint2name and name2codepoint only support 
the 252 HTML entity names defined in the HTML 4 spec from 1997. I'm wondering 
if there's a reason not to support the W3C Recommendation 'XML Entity 
Definitions for Characters':

http://www.w3.org/TR/xml-entity-names/

This standard defines significantly more characters, and it notes that the 
HTML 5 drafts use its entities. You can see the current HTML 5 'Named 
character references' here: 

http://www.w3.org/TR/html5/named-character-references.html#named-character-references
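
To make the gap concrete, here is a minimal sketch of the lookups as they 
behave today. The name 'checkmark' (U+2713) is just one example I picked of a 
name that appears in the HTML 5 list but not in HTML 4; any other such name 
would do.

```python
from html.entities import codepoint2name, name2codepoint

# Names from the HTML 4 set resolve in both directions.
amp_codepoint = name2codepoint["amp"]   # code point of '&'
amp_name = codepoint2name[0x26]         # back to the entity name

# 'checkmark' is listed among the HTML 5 named character references
# but is not part of the HTML 4 set, so the current dict lacks it.
has_checkmark = "checkmark" in name2codepoint

print(amp_codepoint, amp_name, has_checkmark)
```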

If this is just a matter of somebody going in to do the grunt work, let me 
know. 

If startup costs associated with importing a huge dictionary are a concern, 
perhaps a more efficient type that enables the same lookup interface can be 
defined. 
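
A rough sketch of what such a type might look like, assuming a read-only 
mapping that builds its table on first access. LazyEntityMap and 
load_sample_entities are hypothetical names used only for illustration, and 
the three-entry table stands in for the full entity list; none of this exists 
in html.entities.

```python
from collections.abc import Mapping


class LazyEntityMap(Mapping):
    """Read-only mapping that defers building its table until first use.

    Hypothetical helper, not part of the standard library.
    """

    def __init__(self, loader):
        self._loader = loader   # callable returning the name -> code point pairs
        self._table = None      # filled in on first access

    def _load(self):
        if self._table is None:
            self._table = dict(self._loader())
        return self._table

    def __getitem__(self, key):
        return self._load()[key]

    def __iter__(self):
        return iter(self._load())

    def __len__(self):
        return len(self._load())


load_calls = []


def load_sample_entities():
    # Stand-in for parsing or importing the full entity table.
    load_calls.append(1)
    return {"amp": 0x26, "lt": 0x3C, "checkmark": 0x2713}


# Creating the mapping costs nothing; the loader runs on the first lookup.
name2codepoint = LazyEntityMap(load_sample_entities)
```

Because it subclasses Mapping, lookups such as name2codepoint["amp"], 
name2codepoint.get(...), and "amp" in name2codepoint work as they do with a 
dict, while mutation is not supported.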

If other reasons exist to not move in this direction, please do let me know!

----------
components: Library (Lib), Unicode, XML
messages: 127865
nosy: Brian.Jones
priority: normal
severity: normal
status: open
title: html.entities mapping dicts need updating?
type: feature request
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11113>
_______________________________________