Martin J. Dürst wrote:

>> If he is targeting HTML5, then none of this matters, because HTML5
>> says that ISO 8859-1 is really Windows-1252.
>
> Yes. But unless Python wants to limit its use to HTML5, this should be
> handled on a separate level (mapping an "iso-8859-1" label to the
> Windows-1252 decoder logic), not by trying to change ISO-8859-1
> itself.

Normally I would agree. The HTML5 standard itself requires the mapping of one encoding to another, but that could be handled at the appropriate level, as you said. However, the redefinition of mapping algorithms and tables is another matter.
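
To make the "separate level" idea concrete, here is a minimal Python sketch of a label-resolution layer. The names `HTML5_LABEL_OVERRIDES` and `decode_for_html5` are hypothetical, invented for illustration; they are not part of Python or of any specification. Only `bytes.decode` and the stdlib codec names `iso-8859-1` and `cp1252` are real, and the override table is deliberately partial.

```python
# Hypothetical label-resolution layer (illustrative names, partial table).
# The ISO-8859-1 codec itself is left untouched; only the HTML5-facing
# lookup redirects the label to the Windows-1252 decoder.
HTML5_LABEL_OVERRIDES = {
    "iso-8859-1": "cp1252",
}

def decode_for_html5(data: bytes, label: str) -> str:
    """Resolve the label the way an HTML5 consumer might, then decode."""
    key = label.strip().lower()
    codec = HTML5_LABEL_OVERRIDES.get(key, key)
    return data.decode(codec)

raw = b"\x93quoted\x94"

# Plain ISO-8859-1: bytes 0x93 and 0x94 become C1 control characters.
legacy = raw.decode("iso-8859-1")

# HTML5-style lookup: the same bytes become curly quotation marks.
html5 = decode_for_html5(raw, "ISO-8859-1")
```

With this arrangement, callers that want strict ISO-8859-1 keep getting it, and only code that opts into the HTML5 lookup sees the Windows-1252 behavior.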

The "Encoding Living Specification" appears not to be normative, but if it is to be followed, it essentially requires all conforming encoders and decoders to be either rewritten or at least reviewed for conformance. It specifies every step that the software must perform, which may not match the steps performed by existing software, and it specifies mapping tables which may not match the tables previously published by vendors and SDOs.
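
One example of such a table mismatch, as far as I can tell: the vendor-published code page 1252 table leaves bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D unassigned, and Python's `cp1252` codec rejects them, while the Encoding specification's windows-1252 index maps each of them to the C1 control character with the same value. The sketch below contrasts the two behaviors. The function names are hypothetical, and `decode_whatwg_style` is only an approximation of the specified behavior for these five bytes, not a conforming implementation.

```python
# Bytes left unassigned in the vendor-published cp1252 table.
UNASSIGNED_IN_CP1252 = (0x81, 0x8D, 0x8F, 0x90, 0x9D)

def decode_legacy(data: bytes) -> str:
    """Decode with Python's stock cp1252 codec (raises on unassigned bytes)."""
    return data.decode("cp1252")

def decode_whatwg_style(data: bytes) -> str:
    """Approximate the Encoding-spec table: unassigned bytes map to the
    C1 control with the same value; all other bytes use the stock codec."""
    out = []
    for b in data:
        if b in UNASSIGNED_IN_CP1252:
            out.append(chr(b))
        else:
            out.append(bytes([b]).decode("cp1252"))
    return "".join(out)

sample = b"a\x81\x80"

try:
    decode_legacy(sample)
    legacy_failed = False
except UnicodeDecodeError:
    legacy_failed = True

whatwg_result = decode_whatwg_style(sample)
```

The same input is an error under one table and valid text under the other, which is the kind of divergence that would force a review of existing decoders.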

If this document is taken up as a standard part of HTML5, it is not hard to imagine that languages like Python will need to implement two flavors of encoders and decoders: "legacy" and "HTML5-compliant."

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell
