On emoji and the two rightwards black arrows

J.S. Choi Fri, 30 Oct 2015 12:24:32 -0700

# On emoji and the two rightwards black arrows

This is a long post, and I apologize for that; it’s a somewhat complicated 
topic. The post is about two encoded characters:
U+27A1 Black Rightwards Arrow <http://www.unicode.org/charts/PDF/U2700.pdf>
and U+2B95 Rightwards Black Arrow <http://www.unicode.org/charts/PDF/U2B00.pdf>.


• The post first reviews their encodings’ respective histories, as I currently 
understand them; hopefully I’m not mistaken about anything.
• It then informally suggests that U+2B95 be added to emoji-data.txt (and 
possibly be given standardized text/emoji variants), as U+27A1 already has 
been, on the basis that U+2B95 is at least as well suited as U+27A1, if not 
better suited, to serve as a general rightwards arrow symbol.
• It also proposes that their entries in the code charts be clarified to 
explain the differences between their intended functions and when to use one 
versus the other, as per their contrasting histories.

I don’t intend to be making anything like a formal proposal yet, but I might in 
the future. For now, I’d like to clarify the characters’ respective intended 
purposes and see how feasible or likely the proposed changes would be before 
investing time, etc. in a formal proposal.

## History

The history below is taken from the following posts:

• Ken Whistler:
    • 2015-05 <http://www.unicode.org/mail-arch/unicode-ml/y2015-m05/0272.html>.
    • 2015-10 <http://www.unicode.org/mail-arch/unicode-ml/y2015-m10/0223.html>.
• Mark Davis:
    • 2015-10 <http://www.unicode.org/mail-arch/unicode-ml/y2015-m10/0226.html>.
• Michel Suignard:
    • 2015-05 <http://www.unicode.org/mail-arch/unicode-ml/y2015-m05/0268.html>. 
(Note that this post contains paragraphs quoted from another person that are 
not marked as quotations, with Suignard’s replies below each one.)

• 1993: The glyphs from ITC Zapf Dingbats typeface were encoded in the Unicode 
Standard 1.1 for compatibility with PostScript printers that use them. This 
included U+27A1 Black Rightwards Arrow.
The Zapf Dingbat arrows all face rightwards, as generically rotatable arrow 
glyphs. No leftwards, upwards, or downwards versions of arrows were encoded 
because PostScript printers were assumed to rotate generic rightwards arrows in 
original Zapf Dingbats fonts. U+27A1’s representative glyph is taken from Zapf 
Dingbats.

• 2003: Representatives of North Korea (the DPRK) submitted a proposal to add 
compatibility characters for a DPRK encoding standard 
<http://www.unicode.org/L2/L2001/01349-N2374-DPRK-AddSymbols.pdf>. These 
included black-filled arrows in the four cardinal directions.
The proposal only included leftwards, upwards, and downwards black arrows, 
apparently because the representatives believed that U+27A1 fit their purposes 
for compatibility with their rightwards black arrow.
The former three were encoded as U+2B05–U+2B07 in the Unicode Standard 4.0. 
Their representative glyphs and names were taken from the DPRK proposal; the 
glyphs and names thus did not align with U+27A1 (e.g., U+2B05 Leftwards Black 
Arrow vs. U+27A1 Black Rightwards Arrow). Whistler states that “…nobody 
commented on” them and “nobody much cared, because because these were 
compatibility additions for a DPRK standard, and weren't mapped to any 
commercial sets at the time, anyway” (2015-05).
The unification of new DPRK compatibility arrows U+2B05–U+2B07 with rotations 
of Zapf Dingbat arrow U+27A1 was implied by the Standard but not explicit. For 
the next decade, most fonts implementing all four characters used glyphs 
matching the code charts’ (i.e., using the mismatching Zapf Dingbat glyph for 
the right arrow, and the DPRK glyphs for the other black arrows).

• 2011–2013: Google, Apple, and Microsoft begin to support emoji characters 
from Japanese cellular carriers using characters from the Unicode Standard 6.0. 
Four of those Japanese-carrier characters are black arrows in the four cardinal 
directions (UTR #51).
The three companies use the DPRK-compatibility black arrows U+2B05–U+2B07 for 
three of them. Presumably because it was assumed to be part of their set and 
there was no better alternative, they use the Zapf Dingbat U+27A1 for the 
final, rightward black arrow from the Japanese-carrier emoji.
Based on then-current usage, these four characters’ mappings, among others, are 
added to a new, separate Unicode data file for emoji data 
<http://www.unicode.org/Public/emoji/1.0/emoji-data.txt>. The data in this file 
have “not yet been formally rationalized into a coherent set of Unicode 
character properties” (Whistler 2015-10).

• 2014: A “complete re-rationalization of all the arrows symbols” occurred 
(Whistler 2015-05) in the Unicode Standard 7.0 due to addition of arrows from 
Wingdings, Wingdings 2, and Webdings 
<http://www.unicode.org/L2/L2012/12130-n4239.pdf>.
The DPRK-compatibility black arrows U+2B05–U+2B07 are unified with similar 
Wingdings black arrows, and their representative glyphs are modified to 
harmonize with them. However, the glyph of Zapf Dingbat arrow U+27A1 is deemed to be 
unmodifiable, because its identity is strongly coupled to the original arrow 
glyph in the ITC Zapf Dingbat typefaces.
The now-generic black arrows U+2B05–U+2B07 are thus disunified from rotations 
of U+27A1. A new character, U+2B95 Rightwards Black Arrow, is added with the 
intention of completing the U+2B05–U+2B07 set; it receives a correspondingly 
matching representative glyph.

## Present issues

The new U+2B95 Rightwards Black Arrow together with the now-generic 
U+2B05–U+2B07 are supposed to form a single set of arrows, with correspondingly 
matching representative glyphs, as Mr. Suignard has said. It will take time for 
U+2B95 to be implemented by new fonts, but “the explicit glyph updates for 
U+2B00..U+2B0D…were clearly intentional” (Whistler 2015-05). In other words, 
according to the Standard since version 7.0, the character that serves as the 
rightward counterpart of U+2B05–U+2B07 is now clearly U+2B95 rather than 
U+27A1; the latter has been disunified from the set and is now merely a Zapf 
Dingbat.
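
The naming side of this split can be inspected with a short Python sketch. It assumes a Python build whose `unicodedata` module carries Unicode 7.0 or later data (so that U+2B95 has a name), and the helper `matches_set_naming` is a hypothetical name invented here for illustration. It looks only at character names; it says nothing about glyph design, which depends on the font.

```python
import unicodedata

# The three DPRK-compatibility arrows and the two rightwards candidates.
ARROW_SET = [0x2B05, 0x2B06, 0x2B07]
CANDIDATES = [0x27A1, 0x2B95]


def matches_set_naming(cp):
    """Return True if the name follows the '<DIRECTION>WARDS BLACK ARROW' pattern.

    Hypothetical helper: it checks names only, not glyphs.
    """
    return unicodedata.name(chr(cp)).endswith("WARDS BLACK ARROW")


for cp in ARROW_SET + CANDIDATES:
    name = unicodedata.name(chr(cp))
    print(f"U+{cp:04X} {name}: matches set naming = {matches_set_naming(cp)}")
```

Of the five characters, only U+27A1 (BLACK RIGHTWARDS ARROW) fails the name check, which mirrors the name mismatch described in the history above.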

However, this is not yet completely true: UTR #51 and emoji-data.txt 
currently define the rightwards version of U+2B05–U+2B07 to be the Zapf Dingbat 
U+27A1. UTR #51 currently does not define U+2B95 to be an emoji character. 
Furthermore, there are no text/emoji standardized variants of U+2B95 yet, 
unlike U+27A1.
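
For readers unfamiliar with text/emoji variants, here is a minimal Python sketch of how such sequences are formed: the base character is followed by U+FE0E (text presentation) or U+FE0F (emoji presentation). The function name `presentation_sequences` is hypothetical. The sequences for U+27A1 are the ones this post describes as already standardized; the ones for U+2B95 are shown only as an assumption about what they would look like if they were ever added.

```python
VS15 = "\uFE0E"  # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation


def presentation_sequences(cp):
    """Build the text and emoji variation sequences for a base code point.

    Hypothetical helper: it only concatenates code points and does not
    check whether the sequence is actually standardized.
    """
    base = chr(cp)
    return {"text": base + VS15, "emoji": base + VS16}


def as_code_points(s):
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)


# U+27A1: described in this post as already having text/emoji variants.
existing = presentation_sequences(0x27A1)
# U+2B95: NOT standardized at the time of writing; shown as an assumption only.
hypothetical = presentation_sequences(0x2B95)

for label, seqs in (("existing", existing), ("hypothetical", hypothetical)):
    for style, seq in seqs.items():
        print(f"{label} {style}: {as_code_points(seq)}")
```

A renderer that does not recognize a sequence will generally just display the base character in its default style, so the hypothetical U+2B95 sequences above would have no standardized effect today.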

Upon reviewing the history above, it becomes apparent that this is due to a 
mismatch in timing between the advent of Unicode emoji (in 2011–2013) and the 
advent of U+2B95 (in 2014). Apple, Google, and Microsoft had no character other 
than U+27A1 that they could use for the Japanese carriers’ rightward black 
arrow; at that time U+27A1 was still implicitly unified with the other black 
arrows.

It seems to be possible to change the emoji data to more logically match the 
intended usage of the new U+2B95. My questions are thus:

1. Should U+2B95 be added to the set of emoji characters as given by UTR #51 
and emoji-data.txt, with the intent to complete the harmonization with 
U+2B05–U+2B07 that was begun in 2014?

2. If #1’s answer is yes, then should U+2B95 be given text/emoji standardized 
variation sequences, just as U+2B05–U+2B07 already have?

3. Regardless of the answers to the above, should clarification on the 
conceptual differences between U+2B95 (the right black arrow completing 
U+2B05–U+2B07) and U+27A1 (the Zapf Dingbat) be added to their entries in the 
Standard’s code charts? This might clear up a lot of confusion among users and 
font creators, and would only make clearer what has already been made explicit 
by 7.0’s glyph changes.

## Possible objections

There are two objections to #1 and #2 that I could foresee:

First is that, when using emoji, a user might perceive redundancy between an 
emoji form of U+2B95 and the already existent emoji form of U+27A1, and this 
might cause user confusion over which one to use. However, this redundancy has 
already existed since Unicode 7.0, when U+2B95 was added in the first place. 
The Consortium apparently decided at the time that the risk of user confusion 
between U+2B95 and U+27A1 was acceptable in regular-text contexts; I don’t see 
why it would be significantly different in emoji contexts. Vendors’ emoji input 
palettes could simply present only U+2B95, rather than U+27A1, to the user, 
with little disadvantage.

Second is that compatibility mappings with Japanese carrier sets already use 
U+27A1, and mappings should generally be stable across versions of Unicode. 
However, the Unicode emoji data are not yet formally set in stone; there has 
only been preliminary discussion and the initial publication of UTR #51 
(Whistler 2015-10; Davis 2015-10). The mappings with the carrier sets are 
probably thus not under the same stability guarantees that other formal 
mappings are under (and, even if they are, I could find no policy in 
<http://www.unicode.org/policies/stability_policy.html> that prohibits 
modifying formal mappings in general).

In any case, I might make a formal proposal in the future, but I first want to 
determine here how probable it is that such a proposal would be discussed. What 
would you say the answers to those three questions are?

Sincerely,
J. S. Choi
