[emacs-bidi] Re: RTL support

Gregg Reynolds Tue, 22 Nov 2005 20:08:27 -0800

Benjamin Riefenstahl wrote:

Hi Gregg,

Hi Benny,

Thanks for your reasoned reply.  Comments below.


Gregg Reynolds writes:

1.  It was legacy, so Unicode had to support it.  Then they went
   berserk with it.



From my POV, there are very good reasons to consistently encode
characters in the order in which they are written.  You don't want
visual layout for any other operation except display.  You might think
that display is the most important operation on text, but for large
bits of most software it isn't.

Two things. One is, directionality is a design choice, not a reflection of
some kind of objective reality. This is obvious if you stare at some
RTL text and think for a while. However, the Unicode book claims that
RTL languages are "inherently" bidirectional. This is hogwash.

Second, "the order in which [characters] are written" is not relevant to
an encoding model. There is no necessary relationship between the IO
model implemented by an application and the corresponding textual
representation, which is application independent. Specifically, your
editor can support data entry of digit strings as either LSD-first or
MSD-first, or both. Neither data entry protocol has anything to do with
the way the data is encoded in persistent storage. For that matter, the
internal encoding of an editor is independent of the data exchange
formats it im/exports. Emacs being a great example of that.

In other words, "reasons to consistently encode characters in the order
in which they are written" is essentially meaningless. (I say that as
a statement of fact, not as a flame.)
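
A minimal sketch of the data entry point, in Python. The function names
are hypothetical, and storing the most significant digit first is an
assumption made only for illustration; nothing here is taken from Emacs
or from the Unicode standard. It shows two entry protocols (MSD-first
and LSD-first typing) producing the same stored string.

```python
def enter_msd_first(keystrokes):
    """User types the most significant digit first: append each key."""
    buffer = []
    for digit in keystrokes:
        buffer.append(digit)
    return "".join(buffer)


def enter_lsd_first(keystrokes):
    """User types the least significant digit first: each new key goes
    in front of what is already there, so storage still ends up
    MSD-first (the storage order assumed for this sketch)."""
    buffer = []
    for digit in keystrokes:
        buffer.insert(0, digit)
    return "".join(buffer)


# The year 1905 typed in two different orders.
stored_a = enter_msd_first("1905")
stored_b = enter_lsd_first("5091")
assert stored_a == stored_b
```

The stored form could just as well be LSD-first; the only claim the
sketch makes is that the typing order and the stored order are chosen
independently.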


You might think that RTL without bidi would be enough.  But once you
have RTL, it becomes the job of the Unicode standard to define how
mixed content is handled.  Mixed content is after all the driving
force for Unicode in the first place.  I also think that most users

Hmm. I think that's debatable. I think unification of diverse encoding
schemes is the primary driver behind Unicode, but that's a digression.
More important is that RTL has no necessary relationship to mixed
content or bidi reordering. If you only ever write documents in Arabic
(Hebrew, Persian, Pashto, whatever) then why do you need bidi? You
don't; it's an unfortunate artifact of Western-driven standardization.

To be clear: monolingual Arabic text is not mixed content, whether it
contains digit strings or not. So why should an Arabic user pay the
Unicode tax of bidi support?

Don't get me wrong, I'm not saying the bidi algorithm is not useful or
nice to have. But it's an add-on, not needed by the vast majority of
RTL documents produced in the world. Yes, believe it or not, Arabs and
other RTL users actually don't need English, any more than we English
speakers need Arabic. To this day, scholarly writings about Arabic in
English use transliteration. Arabic is quite capable of the same, even
for acronyms like IBM or CIA.

It boils down to an economic argument. For Arabic, we need a) RTL
layout (a purely graphical matter); and b) shaping. Both of these are
(relatively) inexpensive to implement. Support for bidi reordering is a
nice enhancement, but it's a) expensive; and b) unnecessary unless you
write in two or more languages in the same doc.
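
To illustrate how little "RTL layout" alone asks of a renderer, here is
a rough Python sketch. It is an assumption-laden toy, not how Emacs or
any real display engine works: `layout_rtl` is a hypothetical name, each
character is treated as one fixed-width cell, and there is no shaping
and no bidi reordering. Latin letters stand in for RTL characters so
that a terminal's own bidi handling does not interfere with the output.

```python
def layout_rtl(stored, width):
    """Place a single-direction line right to left in a row of cells.

    The first stored character goes in the rightmost column, the next
    one to its left, and so on. The stored string is never reordered;
    only the screen position of each character changes.
    """
    if len(stored) > width:
        raise ValueError("line is longer than the display width")
    cells = [" "] * width
    for index, char in enumerate(stored):
        cells[width - 1 - index] = char
    return "".join(cells)


# "abc" stored in reading order; shown flush right, reading right to left.
row = layout_rtl("abc", 5)
assert row == "  cba"
```

Everything the Unicode bidi algorithm adds (embedding levels, resolving
neutral characters, reordering mixed runs) sits on top of this and is
only needed once a line mixes directions.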

Ask yourself a simple question. Software like Emacs has been around for
what, 30 years? It gained support for e.g. Japanese, Korean, etc. years
ago. But the 1 billion + people in the world who need RTL support are
still waiting. Why is that? IMHO, it's at least partially because of
the perceived but false association of RTL and bidi. (I can cite
specific examples of vendors declining to support Arabic solely because
of the expense of implementing bidi support.) The bidi algorithm is
complex and generally yucky. Thought experiment: imagine a world in
which nobody would implement English language software unless it had
bidi support.


Sincerely,

-gregg


_______________________________________________
emacs-bidi mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/emacs-bidi
