Mark Davis 😎 wrote:
There are superset relations among some of the CJK character sets, and also -- practically speaking -- between some of the Windows and ISO-8859 sets. I say practically speaking because in general environments, the C1 controls are really unused, so where a non-ISO-8859 set is the same except for 80..9F you can treat it pragmatically as a superset.
There was a time, about 10 years ago, when Frank da Cruz would have replied almost immediately about the importance of C1 controls in terminal environments, and the arguments about incompatibility between 8859-1 and Windows-1252 would have been off and running.
That was about the same time that people like Roman Czyborra were complaining that their terminals were scrambled by text encoded in UTF-8, because of its use of bytes in the 80..9F range, and people like Jörg Knappen were creating alternative UTF's to get around this perceived problem.
Regarding the subset/superset terminology, we need to distinguish between "encoding subsets" and "repertoire subsets":
* ASCII is both an encoding subset and a repertoire subset of 8859-1 and Windows-1252 and UTF-8.
* 8859-1 is an encoding subset of Windows-1252, except for the 80..9F range.
* 8859-1 and Windows-1252 are repertoire subsets, but not encoding subsets, of UTF-8.
* 8859-15 is neither type of subset of 8859-1.
* Etc.
--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ is dot gd slash 2kf0s
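The encoding-subset versus repertoire-subset distinction in the list above can be checked mechanically. What follows is only a rough sketch, assuming Python's built-in codec tables stand in for the character sets ("latin-1" for 8859-1, "cp1252" for Windows-1252, "iso8859-15" for 8859-15). The helper names single_byte_chars and subset_relation are made up for illustration and are not part of any library. Note that Python's cp1252 table leaves five bytes in 80..9F undefined, which is one reason the results there depend on whose mapping table you trust.

```python
def single_byte_chars(codec):
    """Return {byte: character} for every single byte the codec can decode."""
    table = {}
    for byte in range(256):
        try:
            table[byte] = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            pass  # byte is undefined (or not a complete character) in this codec
    return table


def subset_relation(a, b, skip=()):
    """Return (encoding_subset, repertoire_subset) for single-byte codec a versus codec b.

    encoding_subset:   every character of a has the same byte representation in b.
    repertoire_subset: every character of a can be represented in b at all.
    skip:              byte values of a to leave out of the comparison.
    """
    encoding_subset = True
    repertoire_subset = True
    for byte, ch in single_byte_chars(a).items():
        if byte in skip:
            continue
        try:
            encoded = ch.encode(b)
        except UnicodeEncodeError:
            # Character missing from b: neither kind of subset holds.
            encoding_subset = False
            repertoire_subset = False
            continue
        if encoded != bytes([byte]):
            encoding_subset = False
    return encoding_subset, repertoire_subset


if __name__ == "__main__":
    c1_range = range(0x80, 0xA0)  # the 80..9F range discussed above
    for a, b, skip in [
        ("ascii", "latin-1", ()),
        ("ascii", "cp1252", ()),
        ("ascii", "utf-8", ()),
        ("latin-1", "cp1252", ()),
        ("latin-1", "cp1252", c1_range),
        ("latin-1", "utf-8", ()),
        ("cp1252", "utf-8", ()),
        ("iso8859-15", "latin-1", ()),
    ]:
        note = " (ignoring 80..9F)" if skip else ""
        print(f"{a} vs {b}{note}: {subset_relation(a, b, skip)}")
```

Each printed pair is (encoding subset, repertoire subset). The skip parameter is how this sketch models the "except for the 80..9F range" qualifier: with that range left out, 8859-1 passes both checks against Windows-1252, and with it included, it passes neither under Python's tables.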

