From: "Jon Hanna" <[EMAIL PROTECTED]>
> > Knowing that Unicode-ISO/IEC 10646 is a now de facto standard (after being a
> > de jure one in ISO) will clearly guide those charset developments complying
> > with Unicode rules and policies, so that such adoption will not create a
> > nightmare to handle, with unreasonable additional costs for transcoding
> > to/from/through Unicode.
>
> If you can't round-trip directly then the cost is unreasonable.

I never said that. Reread. I said that round-trip conversion is possible even with 1-to-N mappings, provided that the character subset is carefully designed so that it does not create ambiguities.

> > I see absolutely no problem if new ISO-8859-* variants is added in the
> > future for better support of African or Asian languages (or even for
> > European ones, i.e. Georgian and Armenian), and no opposition of principles
> > if some newer ISO2022 charset is created for Canadian Syllabics or Ethiopic
> > if this helps processing the corresponding languages.
>
> If they can be round-tripped trivially (as trivially as the current ISO-8859
> family) then I see no problem either, but I also see little point, and the
> motivation gets less every year. Frankly, we have a global encoding now. It has
> problems (many of which come from the fact that it was not practical to act as
> if we were at encoding year-zero - if we had then we probably wouldn't have
> precomposed characters for European languages, never mind any others) but those
> problems are considerably less than existed previously and ISO-8859-17+ is
> always going to be inferior to UTF-8 or UTF-16.

Can you give a reasonable estimate of what you consider a medium- or long-term solution? For me, ISO-8859-1/2 will continue to be used for a very long time. This is a natural consequence of the _slow_ migration or replacement of working software and of the cost of new development. There are many reasons why old software still runs today even though it was developed 20 years ago, long before Unicode existed. In the computer industry there's a general motto that says "if it works and it doesn't break, don't change it!". Some of the oldest software has become so business-critical that it has been scrutinized and maintained with extreme care, notably against security vulnerabilities.
Rewriting the code of these applications is a high risk, and it also means very long and costly compatibility and interoperability tests. You can't simply and immediately replace a piece of software in mission-critical applications: you must also make sure that its "companion" software will work with it, and you need migration plans that include testing the multiple supported interfaces used to interact with old software, and making sure that the new code is not exposed to many new, undetected vulnerabilities that were absent from the old software. In some cases it is even impossible to replace it, and there will be no viable alternative for many years, due to the lack of general-purpose usage (notably for the many programs that work with organization-specific data, often kept proprietary and secret).

Today so much software depends on 8-bit processing, with simple assumptions based on 1 byte = 1 character, that you won't start a revolution. I bet that 8-bit charsets will still be supported in 20 or 30 years, even if these systems are adapted with new interfaces to Unicode-enabled systems. Think about most OS kernels and filesystems, or device configuration: they simply use 8-bit charsets internally and there's no way to adapt them to work with variable-length multibyte encodings (there are too many related security issues for untested cases).
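
To make the "1 byte = 1 character" assumption concrete, here is a minimal sketch in Python (my choice of language for illustration). The `truncate_field` helper, the `is_valid` helper and the sample name are hypothetical and not taken from any real system. It only shows how a fixed-width byte truncation that is harmless on ISO-8859-1 data can cut a UTF-8 sequence in half.

```python
def truncate_field(data: bytes, width: int) -> bytes:
    """Legacy-style truncation: cut a byte string to a fixed field width.

    This is only safe when every character occupies exactly one byte.
    """
    return data[:width]


def is_valid(data: bytes, encoding: str) -> bool:
    """Return True if the bytes decode cleanly in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample value: "M", U+00FC (u with diaeresis), "ller".
name = "M\u00fcller"

latin1 = name.encode("iso-8859-1")  # one byte per character
utf8 = name.encode("utf-8")         # U+00FC takes two bytes here

# Truncate both representations to a 2-byte field.
latin1_cut = truncate_field(latin1, 2)
utf8_cut = truncate_field(utf8, 2)

print(len(latin1), len(utf8))
print(is_valid(latin1_cut, "iso-8859-1"))
print(is_valid(utf8_cut, "utf-8"))
```

The ISO-8859-1 field stays decodable after the cut, while the UTF-8 field ends in the middle of a two-byte sequence and no longer decodes. This is the kind of untested case I mean: code that indexes, pads or truncates by byte count would have to be audited everywhere before a multibyte encoding could be used internally.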

