> In contrast, bringing the cp1252 definition into line with real 
> implementations and recommending UTF-8 for new developments are not mutually 
> exclusive.

Exactly?

If you already have existing data in 1252 or a variation (and can't tell them 
apart), then nothing's gained by making NEW requirements for 1252 which the old 
data won't conform to.  Changing standards or behavior will only break things 
that already work.

If you're creating new data, it should be using UTF-8 to avoid these kinds of 
ambiguity.

-Shawn

On Fri, Dec 7, 2012 at 4:41 PM, Shawn Steele 
<[email protected]> wrote:
It's a variation.  The undefined codepoints in 1252 probably shouldn't be used, 
and I can't imagine that adding a code page helps anything, nor that changing 
an existing behavior helps anything.  People really should be using UTF-8.

-Shawn

From: Buck Golemon [mailto:[email protected]]
Sent: Friday, December 7, 2012 4:34 PM
To: Shawn Steele
Cc: unicode

Subject: Re: data for cp1252

I've been told that bestfit1252 wasn't meant to redefine the cp1252 mapping, 
although its first line declares "CODEPAGE 1252".

Is it a separate encoding or not?

If so, I'll submit a new "bestfit1252" to the python stdlib.
If not, I believe the cp1252 mapping needs to be brought into line.
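
To make the difference concrete, here is a minimal Python sketch. It assumes that Python's built-in "cp1252" codec follows the strict mapping (leaving five byte values undefined) and that the best-fit table maps each of those bytes to the C1 control code point with the same number. The handler name "c1fallback" and both helper functions are hypothetical names made up for this illustration; they are not part of the stdlib.

```python
import codecs


def undefined_in_python_cp1252():
    """Return the byte values that the stdlib cp1252 codec refuses to decode."""
    rejected = []
    for value in range(256):
        try:
            bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            rejected.append(value)
    return rejected


def _c1_fallback(error):
    # Hypothetical error handler: replace each undecodable byte with the
    # code point of the same numeric value (a C1 control character).
    # This imitates the assumed best-fit behavior; it is not a stdlib codec.
    if not isinstance(error, UnicodeDecodeError):
        raise error
    bad_bytes = error.object[error.start:error.end]
    return "".join(chr(b) for b in bad_bytes), error.end


codecs.register_error("c1fallback", _c1_fallback)


def decode_bestfit_like(data):
    """Decode as cp1252, passing undefined bytes through as C1 controls."""
    return data.decode("cp1252", errors="c1fallback")


if __name__ == "__main__":
    print([hex(v) for v in undefined_in_python_cp1252()])
    print(repr(decode_bestfit_like(b"\x80 and \x81")))
```

With the strict codec, data containing any of the undefined bytes fails to decode at all, whereas the best-fit-like variant round-trips them. That is the practical gap behind the "separate encoding or not" question.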


On Fri, Dec 7, 2012 at 4:27 PM, Shawn Steele 
<[email protected]> wrote:
:)

