> In contrast, bringing the cp1252 definition into line with real implementations and recommending UTF-8 for new developments are not mutually exclusive.
Exactly? If you already have existing data in 1252 or a variation (and can't tell them apart), then nothing's gained by making NEW requirements for 1252 which the old data won't conform to. Changing standards or behavior will only break things that already work. If you're creating new data, it should be using UTF-8 to avoid these kinds of ambiguity.

-Shawn

On Fri, Dec 7, 2012 at 4:41 PM, Shawn Steele <[email protected]> wrote:

It's a variation. The undefined codepoints in 1252 probably shouldn't be used, and I can't imagine that adding a code page helps anything, nor that changing an existing behavior helps anything. People really should be using UTF-8.

-Shawn

From: Buck Golemon [mailto:[email protected]]
Sent: Friday, December 7, 2012 4:34 PM
To: Shawn Steele
Cc: unicode
Subject: Re: data for cp1252

I've been told that bestfit1252 wasn't meant to redefine the cp1252 mapping, although its first line declares "CODEPAGE 1252". Is it a separate encoding or not? If so, I'll submit a new "bestfit1252" to the python stdlib. If not, I believe the cp1252 mapping needs brought into line.

On Fri, Dec 7, 2012 at 4:27 PM, Shawn Steele <[email protected]> wrote:

J
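
For readers who want to see the "undefined codepoints" at issue, here is a minimal sketch, assuming Python 3 and its standard library `cp1252` codec. The helper name `undefined_in_codec` is hypothetical and not part of any library. It probes every byte value and reports the ones the codec refuses to decode.

```python
def undefined_in_codec(codec="cp1252"):
    """Return the byte values (0-255) that `codec` refuses to decode.

    Hypothetical helper for illustration only; not a stdlib function.
    """
    missing = []
    for value in range(256):
        try:
            bytes([value]).decode(codec)
        except UnicodeDecodeError:
            # The codec has no mapping for this byte.
            missing.append(value)
    return missing


if __name__ == "__main__":
    # Bytes that the stdlib cp1252 codec treats as undefined.
    print([hex(b) for b in undefined_in_codec("cp1252")])

    # New data written as UTF-8 round-trips without this ambiguity.
    text = "\u20ac100"  # euro sign followed by "100"
    assert text.encode("utf-8").decode("utf-8") == text
```

In the Python 3 standard library this reports the five bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D. Data that contains those bytes fails to decode under that codec, which is the mismatch with real-world "1252" data that the thread is debating.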

