Mark Davis 😎 wrote:

There are superset relations among some of the CJK character sets, and also -- practically speaking -- between some of the Windows and ISO-8859 sets. I say practically speaking because in general environments, the C1 controls are really unused, so where a non-ISO-8859 set is the same except for 80..9F you can treat it pragmatically as a superset.

There was a time, about 10 years ago, when Frank da Cruz would have replied almost immediately about the importance of C1 controls in terminal environments, and the arguments about incompatibility between 8859-1 and Windows-1252 would have been off and running.

That was about the same time that people like Roman Czyborra were complaining that their terminals were scrambled by text encoded in UTF-8, because of its use of bytes in the 80..9F range, and people like Jörg Knappen were creating alternative UTFs to get around this perceived problem.

Regarding the subset/superset terminology, we need to distinguish between "encoding subsets" and "repertoire subsets":

* ASCII is both an encoding subset and a repertoire subset of 8859-1 and Windows-1252 and UTF-8.

* 8859-1 is an encoding subset of Windows-1252, except for the 80..9F range.

* 8859-1 and Windows-1252 are repertoire subsets, but not encoding subsets, of UTF-8.

* 8859-15 is neither type of subset of 8859-1.

* Etc.
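
The distinctions in the list above can be checked mechanically. What follows is only a rough sketch in Python, assuming that Python's built-in codec tables are an acceptable stand-in for the charsets (its cp1252 codec, for one, leaves five bytes in 80..9F undefined). The helper names `table`, `encoding_mismatches` and `repertoire_missing` are made up for illustration.

```python
def table(codec):
    """Map each single byte 00..FF to the character `codec` assigns it.

    Bytes the codec cannot decode on their own map to None.
    """
    result = {}
    for b in range(256):
        try:
            result[b] = bytes([b]).decode(codec)
        except UnicodeDecodeError:
            result[b] = None
    return result


def encoding_mismatches(small, large):
    """Bytes defined in `small` that `large` decodes differently or not at all.

    An empty list means `small` is an encoding subset of `large`.
    """
    s, l = table(small), table(large)
    return [b for b in range(256) if s[b] is not None and s[b] != l[b]]


def repertoire_missing(small, large):
    """Characters of single-byte charset `small` that `large` cannot encode.

    An empty set means `small` is a repertoire subset of `large`.
    """
    missing = set()
    for ch in table(small).values():
        if ch is None:
            continue
        try:
            ch.encode(large)
        except UnicodeEncodeError:
            missing.add(ch)
    return missing


if __name__ == "__main__":
    # ASCII is an encoding subset of all three.
    for big in ("latin-1", "cp1252", "utf-8"):
        print("ascii vs", big, encoding_mismatches("ascii", big))

    # 8859-1 vs Windows-1252: every mismatch falls in 80..9F.
    print([hex(b) for b in encoding_mismatches("latin-1", "cp1252")])

    # 8859-1 vs UTF-8: same repertoire, different bytes above 7F.
    print(repertoire_missing("latin-1", "utf-8"))
    print("é".encode("latin-1"), "é".encode("utf-8"))

    # 8859-15 vs 8859-1: neither kind of subset.
    print([hex(b) for b in encoding_mismatches("iso8859-15", "latin-1")])
    print(sorted(repertoire_missing("iso8859-15", "latin-1")))
```

Note that the sketch only compares single-byte decoding, which is enough to show why 8859-1 is not an encoding subset of UTF-8; it does not attempt to model multi-byte charsets such as the CJK sets.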

--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ is dot gd slash 2kf0s

