----- Original Message -----
| From: "Thorsten Glaser" <t...@mirbsd.de>
| Cc: "lynx-dev" <lynx-dev@nongnu.org>
| Sent: Sunday, June 28, 2020 1:40:48 PM
| Subject: Re: [Lynx-dev] rendering — (0x97)
| Thomas Dickey dixit:
|
|>but in the meantime, the html5 crowd declared that iso-8859-1 is
|>identical to cp1252
|
| WHAT‽
|
| I knew they were crazy, but… like THAT?

Here's something relevant:

https://encoding.spec.whatwg.org/#names-and-labels

I seem to recall reading that in one of those pages summarizing changes
for html5. On the other hand, it might be one of those "facts" created
in Wikipedia (there's a lot of that). And even if I saw it some other
place, Wikipedia might still be the ultimate source. Looking there, I
see it evolving since

https://en.wikipedia.org/w/index.php?title=Windows-1252&type=revision&diff=267905312&oldid=262016711
https://en.wikipedia.org/w/index.php?title=Windows-1252&type=revision&diff=285015046&oldid=285011908

with the second edit referring to

https://web.archive.org/web/20090417231914/http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html

Here's the source for the first edit:

https://web.archive.org/web/20090204094727/http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html

See "8.2.2.2 Character encoding requirements", which (seems familiar)
says that ISO-8859-1 should be treated as if it were CP1252.

Move forward to 2012, and the wording is amended

https://web.archive.org/web/20120930155353/http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html

and going to 2013, I don't see it anymore. That is, I don't see it in
whatwg at that point. But Wikipedia's been updated, and so has whatwg...

As of today, here's the current page:

https://en.wikipedia.org/w/index.php?title=Windows-1252&oldid=964485118

which says

    This is now standard behavior in the HTML5 specification, which
    requires that documents advertised as ISO-8859-1 actually be parsed
    with the Windows-1252 encoding.[5]

    [5] "Encoding". WHATWG. 27 January 2015. sec. 5.2 Names and labels.
        Archived from the original on 4 February 2015. Retrieved
        4 February 2015.
That is, it points to something that we can read on Internet Archive:

https://web.archive.org/web/20150204174315/https://encoding.spec.whatwg.org/#names-and-labels

...and that page does say (in effect) that ISO-8859-1 and several other
charsets:

    "ansi_x3.4-1968"
    "ascii"
    "cp1252"
    "cp819"
    "csisolatin1"
    "ibm819"
    "iso-8859-1"
    "iso-ir-100"
    "iso8859-1"
    "iso88591"
    "iso_8859-1"
    "iso_8859-1:1987"
    "l1"
    "latin1"
    "us-ascii"
    "windows-1252"
    "x-cp1252"

are to be interpreted as CP1252. The current page gives the same
information:

https://web.archive.org/web/20200613144751/https://encoding.spec.whatwg.org/

| That being said this still is UTF-8, not ISO-8859-1…

-- 
Thomas E. Dickey <dic...@invisible-island.net>
http://invisible-island.net
ftp://ftp.invisible-island.net
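
As a rough illustration of the relabeling discussed above, here is a minimal Python sketch of what the 0x97 byte from the subject line becomes under a strict ISO-8859-1 decoder versus one that follows the quoted label table. The helper name `decode_like_browser` and the constant `WINDOWS_1252_LABELS` are hypothetical and exist only for this example; Python's built-in `cp1252` codec is used as a stand-in for the WHATWG windows-1252 decoder, which is an assumption that holds for 0x97 but not necessarily for every byte.

```python
# Labels copied from the list quoted in the message; per the WHATWG page
# they all select the windows-1252 decoder.
WINDOWS_1252_LABELS = frozenset({
    "ansi_x3.4-1968", "ascii", "cp1252", "cp819", "csisolatin1", "ibm819",
    "iso-8859-1", "iso-ir-100", "iso8859-1", "iso88591", "iso_8859-1",
    "iso_8859-1:1987", "l1", "latin1", "us-ascii", "windows-1252", "x-cp1252",
})


def decode_like_browser(data: bytes, label: str) -> str:
    """Hypothetical helper: decode as a client following that table might."""
    normalized = label.strip().lower()
    if normalized in WINDOWS_1252_LABELS:
        # Assumption: Python's cp1252 codec approximates WHATWG windows-1252.
        return data.decode("cp1252")
    return data.decode(normalized)


# Strict ISO-8859-1 maps byte 0x97 to U+0097, a C1 control character.
strict = b"\x97".decode("iso-8859-1")
# Following the label table, the same byte becomes U+2014, an em dash.
lenient = decode_like_browser(b"\x97", "ISO-8859-1")
# In UTF-8 the em dash is three bytes, none of which is 0x97.
utf8_dash = "\u2014".encode("utf-8")

print(f"U+{ord(strict):04X} U+{ord(lenient):04X} {utf8_dash!r}")
```

The last line of the sketch bears on the quoted remark that the page in question is UTF-8: there the dash is encoded as a multi-byte sequence, so the ISO-8859-1 versus CP1252 question would only arise if the bytes were decoded under the wrong label.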