# On emoji and the two rightwards black arrows
This is a long post, and I apologize for that; it’s a somewhat complicated
topic. The post is about two encoded characters:
U+27A1 Black Rightwards Arrow <http://www.unicode.org/charts/PDF/U2700.pdf>
and U+2B95 Rightwards Black Arrow <http://www.unicode.org/charts/PDF/U2B00.pdf>.
• The post first reviews the respective histories of their encodings, as I
currently understand them; hopefully I’m not mistaken about anything.
• It then informally suggests that U+2B95 be added to emoji-data.txt (and
possibly be given standardized text/emoji variants), as U+27A1 already has
been, on the basis that U+2B95 is at least as well suited as U+27A1, if not
better suited, to serve as a general rightwards arrow symbol.
• It also proposes that their entries in the code charts be clarified to
explain the differences between their intended functions, and when to use one
versus the other, in line with their contrasting histories.
I am not making a formal proposal yet, though I might in the future. For now,
I’d like to clarify the characters’ respective intended purposes and see how
feasible or likely the proposed changes would be before investing time in a
formal proposal.
## History
The history below is taken from the following posts:
• Ken Whistler:
  • 2015-05 <http://www.unicode.org/mail-arch/unicode-ml/y2015-m05/0272.html>.
  • 2015-10 <http://www.unicode.org/mail-arch/unicode-ml/y2015-m10/0223.html>.
• Mark Davis:
  • 2015-10 <http://www.unicode.org/mail-arch/unicode-ml/y2015-m10/0226.html>.
• Michel Suignard:
  • 2015-05 <http://www.unicode.org/mail-arch/unicode-ml/y2015-m05/0268.html>.
  (Note that this post contains paragraphs quoted from another person that are
  not marked differently, with Suignard’s replies below each one.)
• 1993: The glyphs from the ITC Zapf Dingbats typeface were encoded in the
Unicode Standard 1.1 for compatibility with PostScript printers that use them.
This included U+27A1 Black Rightwards Arrow.
The Zapf Dingbats arrows all face rightwards, as generically rotatable arrow
glyphs. No leftwards, upwards, or downwards versions of the arrows were encoded
because PostScript printers were assumed to rotate the generic rightwards
arrows in the original Zapf Dingbats fonts. U+27A1’s representative glyph is
taken from Zapf Dingbats.
• 2003: Representatives of North Korea (the DPRK) submitted a proposal to add
compatibility characters for a DPRK encoding standard
<http://www.unicode.org/L2/L2001/01349-N2374-DPRK-AddSymbols.pdf>. These
included black-filled arrows in the four cardinal directions.
The proposal only included leftwards, upwards, and downwards black arrows,
apparently because the representatives believed that U+27A1 fit their purposes
for compatibility with their rightwards black arrow.
The former three were encoded as U+2B05–U+2B07 in the Unicode Standard 4.0.
Their representative glyphs and names were taken from the DPRK proposal; the
glyphs and names thus did not align with U+27A1 (e.g., U+2B05 Leftwards Black
Arrow vs. U+27A1 Black Rightwards Arrow). Whistler states that “…nobody
commented on” them and “nobody much cared, because these were
compatibility additions for a DPRK standard, and weren't mapped to any
commercial sets at the time, anyway” (2015-05).
The unification of new DPRK compatibility arrows U+2B05–U+2B07 with rotations
of Zapf Dingbat arrow U+27A1 was implied by the Standard but not explicit. For
the next decade, most fonts implementing all four characters used glyphs
matching the code charts’ (i.e., using the mismatching Zapf Dingbat glyph for
the right arrow, and the DPRK glyphs for the other black arrows).
• 2011–2013: Google, Apple, and Microsoft begin to support emoji characters
from Japanese cellular carriers using characters from the Unicode Standard 6.0.
Four of those Japanese-carrier characters are black arrows in the four cardinal
directions (UTR #51).
The three companies use the DPRK-compatibility black arrows U+2B05–U+2B07 for
three of them. Presumably because it was assumed to be part of their set and
there was no better alternative, they use the Zapf Dingbat U+27A1 for the
final, rightward black arrow from the Japanese-carrier emoji.
Based on then-current usage, these four characters’ mappings, among others, are
added to a new, separate Unicode data file for emoji data
<http://www.unicode.org/Public/emoji/1.0/emoji-data.txt>. The data in this file
have “not yet been formally rationalized into a coherent set of Unicode
character properties” (Whistler 2015-10).
• 2014: A “complete re-rationalization of all the arrows symbols” occurred
(Whistler 2015-05) in the Unicode Standard 7.0 due to the addition of arrows
from Wingdings, Wingdings 2, and Webdings
<http://www.unicode.org/L2/L2012/12130-n4239.pdf>.
The DPRK-compatibility black arrows U+2B05–U+2B07 are unified with similar
Wingdings black arrows, and their representative glyphs are modified
accordingly so that they harmonize. However, the glyph of the Zapf Dingbat
arrow U+27A1 is deemed to be unmodifiable, because its identity is strongly
coupled to the original arrow glyph in the ITC Zapf Dingbats typeface.
The now-generic black arrows U+2B05–U+2B07 are thus disunified from rotations
of U+27A1. A new character, U+2B95 Rightwards Black Arrow, is added with the
intention of completing the U+2B05–U+2B07 set; it receives a correspondingly
matching representative glyph.
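
The naming mismatch described above (Leftwards Black Arrow vs. Black Rightwards Arrow) can be checked against the Unicode Character Database. The following is only a minimal sketch, assuming a Python 3 interpreter whose bundled `unicodedata` module is based on Unicode 7.0 or later; on older data U+2B95 is unassigned and the lookup raises `ValueError`. The helper name `arrow_names` is made up for illustration.

```python
import unicodedata

# The DPRK-derived set (plus its 7.0 completion) and the Zapf Dingbat arrow.
ARROWS = [0x2B05, 0x2B06, 0x2B07, 0x2B95, 0x27A1]

def arrow_names(code_points):
    """Map each code point to its formal Unicode character name."""
    return {cp: unicodedata.name(chr(cp)) for cp in code_points}

if __name__ == "__main__":
    for cp, name in arrow_names(ARROWS).items():
        print(f"U+{cp:04X} {name}")
```

Only U+27A1 puts BLACK before the direction in its name; the other four follow the direction-first pattern taken from the DPRK proposal.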
## Present issues
The new U+2B95 Rightwards Black Arrow together with the now-generic
U+2B05–U+2B07 are supposed to form a single set of arrows, with correspondingly
matching representative glyphs, as Mr. Suignard has said. It will take time for
U+2B95 to be implemented by new fonts, but “the explicit glyph updates for
U+2B00..U+2B0D…were clearly intentional” (Whistler 2015-05). In other words,
according to the Standard since version 7.0, the matching character that is the
rightward version of U+2B05–U+2B07 is now clearly U+2B95, not U+27A1, which has
been disunified from the set and is now merely a Zapf Dingbat.
However, this is not yet completely true: UTR #51 and emoji-data.txt
currently define the rightwards version of U+2B05–U+2B07 to be the Zapf Dingbat
U+27A1. UTR #51 currently does not define U+2B95 to be an emoji character.
Furthermore, there are no text/emoji standardized variants of U+2B95 yet,
unlike U+27A1.
Upon reviewing the history above, it becomes apparent that this is due to a
timing mismatch between the advent of Unicode emoji (in 2011–2013) and the
advent of U+2B95 (in 2014). Apple, Google, and Microsoft had no character other
than U+27A1 that they could use for the Japanese carriers’ rightward black
arrow; at that time U+27A1 was still implicitly unified with the other black
arrows.
It seems to be possible to change the emoji data to more logically match the
intended usage of the new U+2B95. My questions are thus:
1. Should U+2B95 be added to the set of emoji characters as given by UTR #51
and emoji-data.txt, with the intent to complete the harmonization with
U+2B05–U+2B07 that began in 2014?
2. If #1’s answer is yes, then should U+2B95 be given text/emoji standardized
variation sequences, just as U+2B05–U+2B07 already have?
3. Regardless of the answers to the above, should clarification on the
conceptual differences between U+2B95 (the right black arrow completing
U+2B05–U+2B07) and U+27A1 (the Zapf Dingbat) be added to their entries in the
Standard’s code charts? This might clear up a lot of confusion among users and
font creators, and would only make clearer what has already been made explicit
by 7.0’s glyph changes.
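
For reference, text/emoji variation sequences of the kind asked about in question 2 are formed by appending U+FE0E (text presentation) or U+FE0F (emoji presentation) to the base character. The sketch below only illustrates that mechanism; the helper name `with_presentation` is made up for illustration, and the U+2B95 sequence it builds is hypothetical, since no such sequence is standardized at the time of writing.

```python
import unicodedata

VS15 = "\uFE0E"  # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation

def with_presentation(base, emoji):
    """Return `base` followed by the emoji or text variation selector."""
    return base + (VS16 if emoji else VS15)

# Standardized today: U+27A1 with either selector.
zapf_text = with_presentation("\u27A1", emoji=False)
zapf_emoji = with_presentation("\u27A1", emoji=True)

# Hypothetical: what question 2 would make meaningful for U+2B95.
new_emoji = with_presentation("\u2B95", emoji=True)

if __name__ == "__main__":
    for seq in (zapf_text, zapf_emoji, new_emoji):
        print(" ".join(f"U+{ord(ch):04X}" for ch in seq))
```

Whether a renderer honors the selector depends on the font and platform; an unrecognized sequence is generally expected to display as the base character alone.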
## Possible objections
I can foresee two objections to #1 and #2:
First is that, when using emoji, a user might perceive redundancy between an
emoji form of U+2B95 and the already existent emoji form of U+27A1, and this
might cause user confusion over which one to use. However, this redundancy has
already existed since Unicode 7.0, when U+2B95 was added in the first place.
The Consortium apparently decided at the time that the risk of user confusion
between U+2B95 and U+27A1 was acceptable in regular-text contexts; I don’t see
why it would be significantly different in emoji contexts. Vendors’ emoji input
palettes could simply present only U+2B95, rather than U+27A1, to the user,
with little disadvantage.
Second is that compatibility mappings with Japanese carrier sets already use
U+27A1, and mappings should generally be stable across versions of Unicode.
However, the Unicode emoji data are not yet formally set in stone; there has
only been preliminary discussion and the initial publication of UTR #51
(Whistler 2015-10; Davis 2015-10). The mappings with the carrier sets are
probably thus not under the same stability guarantees that other formal
mappings are under (and, even if they are, I could find no policy in
<http://www.unicode.org/policies/stability_policy.html> that prohibits
modifying formal mappings in general).
In any case, I might make a formal proposal in the future, but I first want to
determine here how likely it is that such a proposal would be discussed. What
would you say the answers to those three questions are?
Sincerely,
J. S. Choi