On 10/6/2016 7:54 AM, Charlotte Buff wrote:
> If theoretically I wanted to convert an old Shift JIS document containing emoji to Unicode, how should I ideally handle Shibuya 109?

The general answer to that is: convert it to U+FFFD (REPLACEMENT CHARACTER), unless you are doing something specific and know what you are doing, in which case you can use a PUA (Private Use Area) code point, insert an image, or do whatever else you need to do.
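
For illustration only, here is a minimal Python sketch of that policy. It is not a reference converter. The function name convert_sjis, the pua_map parameter, the assumption that lead bytes 0xF0-0xFC mark carrier or vendor extensions, and the byte pair F9 7E used in the usage note are all hypothetical; the real byte value of any particular carrier emoji has to come from the carrier's own tables. Standard two-byte characters are handed to Python's built-in shift_jis codec, and anything it cannot decode also becomes U+FFFD.

```python
from typing import Dict, Optional

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def convert_sjis(data: bytes, pua_map: Optional[Dict[bytes, str]] = None) -> str:
    """Decode Shift JIS bytes, replacing extension-range pairs with U+FFFD.

    pua_map is an optional, caller-supplied table (hypothetical) from a
    two-byte extension sequence to a private-use character, for people who
    are doing something specific and know what they are doing.
    """
    pua_map = pua_map or {}
    out = []
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80 or 0xA1 <= b <= 0xDF:
            # Single byte: ASCII range or JIS X 0201 half-width katakana.
            out.append(data[i:i + 1].decode("shift_jis", errors="replace"))
            i += 1
        elif (0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF) and i + 1 < n:
            # Standard double-byte range: let the codec decide.
            try:
                out.append(data[i:i + 2].decode("shift_jis"))
            except UnicodeDecodeError:
                out.append(REPLACEMENT)
            i += 2
        elif 0xF0 <= b <= 0xFC and i + 1 < n:
            # Assumed extension area (user-defined / carrier emoji).
            out.append(pua_map.get(data[i:i + 2], REPLACEMENT))
            i += 2
        else:
            # Stray or truncated byte.
            out.append(REPLACEMENT)
            i += 1
    return "".join(out)
```

Usage, with the made-up pair F9 7E standing in for an unmapped emoji: convert_sjis("あ".encode("shift_jis") + b"\xf9\x7e" + b"!") gives "あ", then U+FFFD, then "!". Passing pua_map={b"\xf9\x7e": "\ue000"} gives U+E000 in place of U+FFFD, which is only meaningful inside a private agreement.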

This is not a character *standardization* issue that requires the UTC to come up with a generic interchange solution for every pre-Unicode character encoding of everything that ever was, whether it be some oddball Shift JIS extensions that were omitted in the consensus on encoding the Japanese Carrier Emoji:

http://www.unicode.org/reports/tr51/tr51-7.html#Japanese_Carrier

or other odds and ends from bizarre, dead-end, disused character encodings from a previous generation.

By the way, the biggest ongoing problem we deal with here is the continuing urge to proliferate font-encoded hacks for particular languages and writing systems. The text interchange problems that such schemes pose on an ongoing basis far far outweigh issues like what to do with a Shibuya 109 emoji, imo.

--Ken
