Re: RFC: controlling bidirectional mirroring of characters

Erik Carvalhal Miller via Unicode Mon, 29 Dec 2025 16:49:29 -0800

Not everybody missed your message, but in a holiday season one may
easily be missing the time to reply.  Some comments follow
nonetheless.  Disclaimer:  Iʼm just some guy who reads the list.

I followed the discussion in April with interest, and I congratulate
you on your draft, for it shows a lot of thought and some creative
problem‐solving.  That said, Iʼm about to rip it apart.

The draft explicitly anticipates pushback on the “extensions”, and
thatʼs a good place to start.  The draft largely refrains from
providing justification for the extension charactersʼ adoption:  The
customary images of proposed characters in real‐world use are absent
(yes, these are formatters one does not see directly, but still it
should be possible to provide images showing the intended effects in
existing contexts); and even in the case of novel characters such as
many emoji and newly devised currency symbols, successful proposals
are able to provide evidence of the charactersʼ likely real‐life
utility.  Instead, the draft resorts to rationales such as
“inoffensive” and “[w]hy not[…]?”, which are hardly proactive,
compelling arguments.  The First Natural Extension (re NEVER SUBJECT
TO MIRRORING) indeed argues eloquently against itself, telling us its
intended effect is already available in Unicode (as LRI…PDI).  The
Second Natural Extension isnʼt forthcoming about a similar problem
with ALWAYS SUBJECT TO MIRRORING (think: RLI…PDI), and itʼs a mystery
what useful functionality REVERSED SUBJECT TO MIRRORING and INVERSE
SUBJECT TO MIRRORING provide (do we really need a new way — let alone
two — to turn GREATER-THAN SIGN into its own already encoded
opposite?).  Regarding the Final Potential Extensionʼs VERTICALLY
SUBJECT TO MIRRORING, I laud the draft for remembering vertical text
(though here itʼs about rotation, not mirroring, right?), but again
thereʼs no attempt to convince beyond an unconvincing “might as well”.

Even if the extensionsʼ rationales are bolstered, I expect a highly
significant problem to remain irremediable:  The extensions run
terribly afoul of some of the Unicode Design Principles (§ 2.2 of the
Core Specification), without compensatory benefit satisfying any of
the other Design Principles or some other significant consideration.
The primary issue is with the principle of Plain Text:  Expansion of
mirroring to all visible characters (setting aside the question of
symmetry), as in the Third Natural Extension, or else to just the
directionally neutral ones, as more generally proposed, is gimmickry
that litters plain text with markup for special effects generally
better served by higher‐level protocols or images.  The wholesale
multiplication of superfluously homoglyphic encodings erodes the
principles of Efficiency and Unification (what characters are we
really minding when we mind our pʼs and qʼs?).

So, letʼs return to the draftʼs Core Proposal.  Since it applies to
rather a large repertoire of characters, the same problems occur:
Itʼs not clear why we need a plain‐text mechanism to specify (for
example) a reversed AMPERSAND or OCR BRANCH BANK IDENTIFICATION or
KANGXI RADICAL DRAGON or PLAYING CARD KING OF HEARTS or to sometimes
make members of such character pairs as MODIFIER LETTER ACUTE ACCENT &
MODIFIER LETTER GRAVE ACCENT or IDEOGRAPHIC DESCRIPTION CHARACTER
SURROUND FROM UPPER LEFT & IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND
FROM UPPER RIGHT resemble one another.  Although the mechanism would
apply to emoji, the draft doesnʼt explicitly address them, and so it
ignores the fact that
emoji already have a burgeoning mechanism for specifying directional
orientation (which, incidentally, involves arrow characters), and itʼs
unclear how DIRECTIONALLY SUBJECT TO MIRRORING would interact with
that or with emoji ZWJ sequences in general, with pairs of regional
indicator symbols, or with emoji tag sequences.  The original
discussion focused on arrow characters such as U+2192 RIGHTWARDS ARROW
and U+2190 LEFTWARDS ARROW commonly seen in ordinary text, and there
was some proffered justification for exploring a bidi‐mirroring
solution for a small set of such arrows, and accordingly I would have
expected any proposal with a chance of success to cover only such
specific characters with specific, articulated justification.  (The
draftʼs mention of FRACTION SLASH puzzles me because I am unaware of
bidi and mirroring issues involving inline or diagonal fractions; if a
mirrored fraction solidus is needed for at least one RTL script, the
draft should explain the need and explain why applying a
mirroring‐formatting character is a better solution than, say,
adopting a RIGHT-TO-LEFT FRACTION SLASH character.)
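
For anyone wanting to check which of these characters mirror today, here is
a minimal sketch using Pythonʼs standard unicodedata module.  The helper
name is_bidi_mirrored is mine, not anything from the draft, and the answers
depend on the Unicode version bundled with your Python.

```python
import unicodedata


def is_bidi_mirrored(ch: str) -> bool:
    """Return True if the character has Bidi_Mirrored=Yes."""
    # unicodedata.mirrored() returns 1 for mirrored characters, else 0.
    return unicodedata.mirrored(ch) == 1


# GREATER-THAN SIGN and LEFT PARENTHESIS versus the arrows and FRACTION SLASH.
for ch in (">", "(", "\u2192", "\u2190", "\u2044"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {is_bidi_mirrored(ch)}")
```

The arrows and FRACTION SLASH report False, which is the status quo the
April discussion started from.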

On the subject of justification:  The draft cites five “Real‐World Use
Cases” regarding arrow characters.  Of them, the first three are said
to have been resolved using higher‐level protocols, and the fifth is
said to remain unresolved but to be inappropriate for the draftʼs
proposed mechanism.  The fourth, which regards automatic replacement
of sequences such as ⟨-⁠-⁠>⟩ with arrows, is said to be unresolvable
unless, essentially, the software engineers reëngineer the software to
achieve something that other software achieves — this does not sound
like a strong argument, particularly for a niche convenience feature
(if it is convenient — I am reminded of fora which transform the
sequence <RIGHT PARENTHESIS, COLON> into a sad face athwart my intent,
thereby giving me a sad face for real).  Other solutions abound, for
example:
 • not making the replacement
 • providing users with insert‐arrow buttons
 • replacing only the hyphen‐minuses with a character serving as the
arrow stem, such as U+23AF HORIZONTAL LINE EXTENSION (e.g., ⟨A-->B &
א-->ב⟩ → ⟨A⎯>B & א⎯>ב⟩)
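
The third option could be sketched roughly as follows.  This is only an
illustration under my own assumptions (the function name and the rule that
a run of two or more hyphen‐minuses before ">" becomes a single stem are
mine), not a description of how any existing software behaves.

```python
import re

ARROW_STEM = "\u23AF"  # HORIZONTAL LINE EXTENSION


def replace_arrow_stem(text: str) -> str:
    """Replace the hyphen-minus run of an ASCII arrow with a stem.

    The '>' is left untouched, so the bidi algorithm keeps mirroring it
    as it already does in right-to-left runs.
    """
    # Lookahead keeps '>' out of the match; only the hyphens are replaced.
    return re.sub(r"-{2,}(?=>)", ARROW_STEM, text)


print(replace_arrow_stem("A-->B & \u05D0-->\u05D1"))
```

Because GREATER-THAN SIGN is already a mirrored character, no new
mechanism is needed for the arrowhead to point the right way in either
direction.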

Though rooted in technical details, the draft leaves some significant
technical issues unaddressed.  It proposes DSM as a “combining
formatting character” — so, general category of Mn/nonspacing mark
(like CGJ), I suppose (given the proposed behavior amid combining
marks), rather than Cf/format control?  What would be the combining
class and the impact on normalization?
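
To make the question concrete, here is a small sketch of how the existing
precedents behave, again using Pythonʼs unicodedata module.  It says
nothing about DSM itself, which has no code point; it only shows what CGJ
(an Mn character with combining class 0) and RLI (a Cf character) look
like, and how a class‐0 character in the middle of a combining sequence
blocks canonical reordering.  The helper names are mine.

```python
import unicodedata

CGJ = "\u034F"  # COMBINING GRAPHEME JOINER
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE


def properties(ch: str) -> tuple[str, int]:
    """Return (general category, canonical combining class)."""
    return unicodedata.category(ch), unicodedata.combining(ch)


def nfd(s: str) -> str:
    return unicodedata.normalize("NFD", s)


print(properties(CGJ))  # general category Mn, combining class 0
print(properties(RLI))  # general category Cf, combining class 0

# DOT ABOVE (class 230) followed by DOT BELOW (class 220) is reordered
# by normalization; an intervening class-0 character prevents that.
plain = "a\u0307\u0323"
blocked = "a\u0307" + CGJ + "\u0323"
print(nfd(plain) == "a\u0323\u0307")
print(nfd(blocked) == blocked)
```

If DSM were given combining class 0, it would presumably split combining
sequences in the same way, which is exactly the sort of side effect the
draft ought to spell out.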

Another technical concern, reaching far beyond the technical:  What
happens if DSM is used but not supported?  If itʼs a default ignorable
code point (like CGJ and most formatting characters) but supported for
you, you could compose something that renders as ⟨⁧א ← ב⁩⟩ and find it
satisfactory, only for it to render as ⟨⁧א → ב⁩⟩ for your
tech‐deficient readership, possibly fomenting disaster.  If DSM is not
default ignorable and not supported, then your readership may instead
get something like ⟨⁧א →⁠⎕ ב⁩⟩; while the unsupported‐character symbol
is a hint thereʼs something wrong in the rendering (to readers who
recognize it), usually it suggests that thereʼs a glyph missing in its
place, not that a neighboring character is represented by the wrong
glyph (or by a merely questionable glyph, as for the same readers a
left‐to‐right ⟨A → B⟩ might render as ⟨A →⁠⎕ B⟩).  For my money, the
notion that lack of support will not merely obscure meanings (as is to
be expected) but actually invert intended meanings is a fatal flaw.
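
The three outcomes above can be modeled with a toy sketch.  Everything in
it is hypothetical: DSM has no code point, so a private‐use character
stands in for it, the "renderer" merely picks glyphs in logical order, and
I am assuming DSM follows the arrow it affects.  It is meant only to show
how the fallback cases diverge, not how any real renderer works.

```python
DSM = "\uE000"      # hypothetical stand-in (private use); not a real DSM
MISSING = "\u2395"  # box shown for an unsupported character
MIRROR = {"\u2192": "\u2190", "\u2190": "\u2192"}


def glyphs(text: str, rtl: bool, supports_dsm: bool,
           default_ignorable: bool) -> str:
    """Return the glyphs chosen for text, in logical order."""
    out: list[str] = []
    for ch in text:
        if ch != DSM:
            out.append(ch)
        elif supports_dsm:
            # Supported: mirror the preceding arrow in an RTL context.
            if rtl and out and out[-1] in MIRROR:
                out[-1] = MIRROR[out[-1]]
        elif not default_ignorable:
            # Unsupported and visible: a missing-glyph box appears.
            out.append(MISSING)
        # Unsupported and default ignorable: silently dropped.
    return "".join(out)


text = "\u05D0 \u2192" + DSM + " \u05D1"
print(glyphs(text, rtl=True, supports_dsm=True, default_ignorable=True))
print(glyphs(text, rtl=True, supports_dsm=False, default_ignorable=True))
print(glyphs(text, rtl=True, supports_dsm=False, default_ignorable=False))
```

The second call is the worrying one: nothing signals a problem, yet the
arrow points the opposite way from what the author saw.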

Despite these misgivings, I am sympathetic to the effort.  I hope this
critique is of some use in the quest for a solution.

On Sun, Dec 28, 2025 at 6:02 AM Nitai Sasson via Unicode
<[email protected]> wrote:
>
> Just repeating this. I think everybody missed this because I sent it during 
> the holiday season.
>
> On Thursday, 12/18/25 at 18:11 Nitai Sasson via Unicode 
> <[email protected]> wrote:
>
> Hello all!
>
> I've been sitting on this for a while, kind of afraid to finish it up and 
> send it. I've finally decided to just do so, even though it's not perfect.
>
> Following the email discussion from April 2025, I want to propose a combining 
> formatting character to affect the mirroring behavior of arrow characters 
> (and potentially other characters) in bidirectional text. The initial idea 
> for this was originally brought up by Mark E. Shoulson while brainstorming in 
> his first reply.
>
> This is a request for comment and a draft for that proposal.
>
> Please see it at:
>
> https://codeberg.org/NeatNit/unicode-bidi-arrows-proposal/src/branch/main/email.md
>
> Thank you,
> Nitai Sasson
