> The status of these 5 characters is already in the best fit mappings document pointed to by the IANA registry entry for windows-1252, which is strong as I’m willing to go for them.
I don't understand the relation between bestfit1252 and cp1252. Could you clarify it for me?

If I read the mapping file right, bestfit1252 declares a definition of cp1252, so it would make sense (to me) if the corresponding parts of the two files matched. As far as I can see, the w3c-cp1252 corresponds to bestfit1252.

On Wed, Nov 21, 2012 at 8:58 AM, Shawn Steele <shawn.ste...@microsoft.com> wrote:

> I’ll be more definitive than Murray :) Our legacy code pages aren’t
> going to change. We won’t add more characters to 1252. We won’t add new
> code pages. We aren’t going change names (since that’ll break anyone
> already using them), we probably won’t recognize new names (since anyone
> trying to use a new name wouldn’t work on millions of existing computers,
> so no one would add it).
>
> The churn is too painful for customers. If there’s a new character that
> everyone “must” use, we’ll point them at UTF-8 or UTF-16. Any request to
> change codepage behavior would have to meet a very high bar.
>
> The status of these 5 characters is already in the best fit mappings
> document pointed to by the IANA registry entry for windows-1252, which is
> strong as I’m willing to go for them.
>
> The last thing I did WRT to code page standards was to ask for the best
> fit mappings to be posted so that the IANA charset registry would have
> something to reference to clarify the existing names. It’s possible (if I
> find the time) that a few of the IANA charset entries could be updated to
> emphasize that some common names have differing implementations by
> different vendors/OS’s such as was done for shift_jis
> http://www.iana.org/assignments/charset-reg/shift_jis or the updates to
> point out the best fit mapping for 1252 at
> http://www.iana.org/assignments/charset-reg/windows-1252 In other words,
> the trend is to clarify that there are variations in behavior, and to
> please use Unicode.
>
> Also see:
>
> http://blogs.msdn.com/b/shawnste/archive/2007/09/24/are-we-going-to-update-or-maintain-the-best-fit-or-code-page-mappings.aspx
>
> http://blogs.msdn.com/b/shawnste/archive/2008/01/17/code-pages-and-security-issues.aspx
>
> http://blogs.msdn.com/b/shawnste/archive/2007/03/20/some-reasons-to-make-your-application-unicode.aspx
>
> (and
> http://blogs.msdn.com/b/shawnste/archive/2012/06/16/building-the-lego-disney-wonder.aspx
> just because I think it’s cool)
>
> I can see why HTML5 might think windows-1252 support is a good idea, but
> personally I’d’ve been happier if it wasn’t a requirement. Too much code
> page corruption happens on the web, and most of the badly-tagged content
> probably misdeclares itself as 1252. UTF-8 is a WAY better choice,
> particularly for the characters in the set supported by windows-1252.
>
> -Shawn
>
> SSDE,
> Microsoft
>
> *From:* unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] *On
> Behalf Of *Murray Sargent
> *Sent:* Tuesday, November 20, 2012 8:55 PM
> *To:* verd...@wanadoo.fr; Doug Ewell
> *Cc:* Unicode Mailing List; Buck Golemon
> *Subject:* RE: cp1252 decoder implementation
>
> Phillipe commented: “(even if later Microsoft decides to map some other
> characters in its own "windows-1252" charset, like it did several times and
> notably when the Euro symbol was mapped)”.
>
> Personal opinion, but I’d be very surprised if Microsoft ever changed the
> 1252 charset. The euro was added back in 1999 when code pages were still
> used a lot. Code pages in general are pretty much irrelevant today except
> for reading legacy documents. They are virtually never used internally in
> modern software. UTF-8,UTF-16, and UTF-32 are what are used these days.
>
> (But code pages do have the advantage that they are associated with
> specific character repertoires, which amounts to a great hint for font
> binding…)
>
> Murray
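
For reference, here is a minimal Python sketch of the mismatch I am asking about. It assumes the "5 characters" are the byte values 0x81, 0x8D, 0x8F, 0x90 and 0x9D: Python's built-in cp1252 codec leaves them unmapped, whereas (as I read it) bestfit1252 maps each one to the C1 control with the same numeric value. The helper names decode_strict and decode_bestfit_style are mine, purely for illustration, and do not come from either mapping file.

```python
# Byte values assumed to be the "5 characters" under discussion:
# the positions Python's cp1252 codec treats as undefined.
UNDEFINED_IN_CP1252 = [0x81, 0x8D, 0x8F, 0x90, 0x9D]


def decode_strict(byte_value):
    """Decode one byte with the built-in cp1252 codec; return None if unmapped."""
    try:
        return bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        return None


def decode_bestfit_style(data):
    """Hypothetical helper: decode as cp1252, but map each unmapped byte
    to the code point with the same numeric value (a C1 control), which is
    how I understand bestfit1252 to treat those positions."""
    out = []
    for b in data:
        ch = decode_strict(b)
        out.append(ch if ch is not None else chr(b))
    return "".join(out)


if __name__ == "__main__":
    for b in UNDEFINED_IN_CP1252:
        strict = decode_strict(b)
        bestfit = decode_bestfit_style(bytes([b]))
        print(f"0x{b:02X}: strict={strict!r} bestfit-style=U+{ord(bestfit):04X}")
```

If the two files are meant to describe the same charset, I would expect a decoder built from either one to agree on these five bytes, and that is the part I would like clarified.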