2012/6/1 Doug Ewell <[email protected]>:
> Peter Constable <petercon at microsoft dot com> wrote:
>
>> The only requirement of Unicode was to provide a way to map Shift-JIS
>> encoded text involving emoji to Unicode / 10646 in a way that could be
>> round-tripped,
>
> This is the part that has always confused me. At what point does text
> encoded in a vendor's private-use extension to Shift-JIS become
> "Shift-JIS encoded text"? Because I know for sure that I'm not supposed
> to refer to characters assigned to the Unicode PUA, my own or anyone
> else's, as being "encoded in Unicode."

Maybe because, without anyone admitting it publicly, those symbols really have a much wider use than in these private Shift-JIS extensions. In that case, the need for round-trip compatibility is definitely not the main reason for encoding them, and these symbols should be considered more globally. They are certainly needed in other countries and in other private implementations, but so far without the interoperability one could expect between implementations in which they obviously mean the same thing and play the same role in text.

The private extension is just one sign that they were needed. The pressure to include them in standard Shift-JIS is another sign, and from it follows the need to map them into the UCS as well, via their standardization in Shift-JIS (whether or not it succeeds in that standard).

Of course, encoding flags visually in an international standard is much more difficult if one wants to encode some flags and not others, not least because of political issues. That's why I propose another way to represent them. This won't affect the private-use Shift-JIS encoding, which can now have round-trip compatibility for its existing symbols, even if standard Shift-JIS will now prefer the more generic symbols over integrating the private-use extension.

