[issue5902] Stricter codec names

Marc-Andre Lemburg Thu, 24 Feb 2011 01:44:54 -0800

Marc-Andre Lemburg <[email protected]> added the comment:

Alexander Belopolsky wrote:
> 
> Alexander Belopolsky <[email protected]> added the comment:
> 
>> Accepting all common forms for
>> encoding names means that you can usually give Python an encoding name
>> from, e.g. a HTML page, or any other file or system that specifies an
>> encoding.
> 
> I don't buy this argument.  Running attached script on 
> http://www.iana.org/assignments/character-sets shows that there are hundreds 
> of registered charsets that are not accepted by python:
> 
> $ ./python.exe iana.py| wc -l
>      413
> 
> Any serious HTML or XML processing software should be based on the IANA 
> character-sets file rather than on the ad-hoc list of aliases that made it 
> into encodings/aliases.py.


Let's do a reality check:

How often do you see requests for additions to the aliases we
have in Python? Perhaps one a year, if that.

We take great care not to add aliases that are not in common
use or that do not have a proven track record of really being
compatible with the codec in question.

If you think we are missing some aliases, please open tickets
for them, indicating why these should be added.

If you really want complete IANA coverage, I suggest you create
a normalization module which maps the IANA names to our names
and upload it to PyPI.
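
A minimal sketch of what such a normalization module might look like.
The names IANA_TO_PYTHON, normalize_iana_name and lookup_iana are
hypothetical, and the three table entries are only illustrative; a real
module would presumably generate its table from the IANA character-sets
file:

```python
import codecs

# Hypothetical table mapping lower-cased IANA charset names to Python
# codec names. Only a few illustrative entries are listed here.
IANA_TO_PYTHON = {
    "ansi_x3.4-1968": "ascii",
    "iso_8859-1:1987": "latin_1",
    "extended_unix_code_packed_format_for_japanese": "euc_jp",
}


def normalize_iana_name(name):
    """Return a Python codec name for an IANA charset name.

    Names missing from the table are returned stripped and lower-cased,
    so that codecs.lookup() can still apply Python's own aliases.
    """
    key = name.strip().lower()
    return IANA_TO_PYTHON.get(key, key)


def lookup_iana(name):
    """Look up a codec by IANA name; raises LookupError if unknown."""
    return codecs.lookup(normalize_iana_name(name))
```

With this, lookup_iana("Extended_UNIX_Code_Packed_Format_for_Japanese")
would resolve to the euc_jp codec, while names Python already knows pass
through to the normal alias handling unchanged.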

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue5902>
_______________________________________