Hi Behdad, 

I just finished finding the relevant part (Rule L4 of UAX #9) of the
Unicode specs referring to mirroring. I believe the problem I am
complaining about is still a problem and is due to a bad Unicode
specification. I do not know how mirroring got into the standard, or
the rationale behind it. However, in my opinion, the
correct semantics is that if the input text has matched opening and closing
parentheses, then the visual output should also have matched left and
right parentheses regardless of the paragraph mode. Obviously the
Unicode specs break this semantics when the text is "RTLTEXT(RTLTEXT)"
and the paragraph is in LTR mode (or vice versa).
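
For anyone who wants to check the character properties involved, here
is a small Python sketch (an illustration only, not part of UAX #9). It
uses the standard unicodedata module to list the bidi class and the
mirrored flag of each character in the problematic string. The helper
name describe() is made up for this example.

```python
import unicodedata

def describe(text):
    """Return (character name, bidi class, mirrored flag) for each character."""
    return [
        (unicodedata.name(ch, "?"),
         unicodedata.bidirectional(ch),
         bool(unicodedata.mirrored(ch)))
        for ch in text
    ]

# "T A R SP ( F A R S I )" written with Persian letters, in logical order.
sample = "\u062a\u0627\u0631 (\u0641\u0627\u0631\u0633\u06cc)"

for name, bidi_class, mirrored in describe(sample):
    print(f"{name:35} {bidi_class:3} mirrored={mirrored}")
```

The parentheses come out as neutral (class ON) and mirrored, so both
their direction and their glyph depend on how the algorithm resolves
the surrounding context, including the paragraph direction.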

While we are talking about the semantics behind the BIDI algorithm, I was
wondering whether it assigns the same direction to characters
regardless of where a line is broken. Apparently it does not! For
example, type in "This a very very long line فارسی +-* یا عربی *-+
this is the question!" in a multiline input area. Notice that the visual
order of *-+ is the same in both occurrences. Now, insert spaces at the
beginning until both of the *-+ land on the second line, and
observe the difference in the ordering of the *-+. I again believe this is
a design defect of the BIDI specification: it only looks at one
line at a time and does not allow (unless I am mistaken) state
information to be propagated across lines when lines are broken. A
better design would have allowed (and required) the necessary
state information to be passed from one line to the next, so that the visual
ordering would stay the same regardless of where the lines are
broken.

Of course, a typical reply could be that I need to insert some control
characters to achieve the desired ordering. Then, my rebuttal is that
if that is the case, why not make the control characters for such
cases mandatory?

Anyway, I have no hope of making any positive contribution at the
Unicode Consortium (or other big standards groups like that). So, I am
going to turn this into something more fruitful. That is, I would like to
put the burden of correcting these flaws on the UI:

"The UI should add control characters at the proper places in the user's
text so that the text renders semantically correctly regardless of
BIDI inconsistencies"

I think satisfying the above requirement is not trivial, but
challenging enough to keep a few good minds busy thinking about it.
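
As a starting point only, here is a rough Python sketch of the simplest
heuristic I can think of for the requirement quoted above: find the
first strong character of a piece of user text and wrap the whole piece
in the matching explicit embedding (LRE or RLE, closed by PDF). The
function names are hypothetical, and this is an assumption about what a
UI could do, not something UAX #9 prescribes. It does not address the
line-breaking case at all.

```python
import unicodedata

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

def first_strong_direction(text):
    """Return 'ltr' or 'rtl' for the first strong character, or None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None

def wrap_with_embedding(text):
    """Wrap text in LRE/RLE ... PDF according to its first strong character."""
    direction = first_strong_direction(text)
    if direction == "rtl":
        return RLE + text + PDF
    if direction == "ltr":
        return LRE + text + PDF
    return text  # no strong character found: leave the text untouched

# "T A R SP ( F A R S I )" in logical order, as in the original report.
sample = "\u062a\u0627\u0631 (\u0641\u0627\u0631\u0633\u06cc)"
wrapped = wrap_with_embedding(sample)
print([f"U+{ord(ch):04X}" for ch in wrapped])
```

With the RLE wrapper the whole run sits inside a right-to-left
embedding, so the trailing parenthesis should no longer pick up the
left-to-right paragraph direction. I have not verified this on every
renderer.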


On Thu, 10 Jun 2004 21:47:03 -0400, Behdad Esfahbod
<[EMAIL PROTECTED]> wrote:
> 
> 
> Hi Ordak,
> 
> This is not a problem in the Unicode Bidi Algorithm, not even in
> Microsoft's implementation of the algorithm.  And mirroring seems
> to be working quite well.  The problem is in the higher-level
> protocols of your system, which simply do not recognize
> right-to-left paragraphs.
> 
> So your "paragraph direction" is left-to-right, and that's why
> you see it like that.  Microsoft systems have no way of
> auto-detecting paragraph directions.  In notepad you can set the
> whole document direction to rtl or ltr.  In MS Word you can set
> direction for individual paragraphs.
> 
> GNOME has recently applied a marvelous patch to autodetect
> paragraph directions in the most sophisticated way, so we're just
> having fun with our text editors ;-).
> 
> behdad
> 
> 
> 
> On Thu, 10 Jun 2004, Ordak D. Coward wrote:
> 
> > I noticed that certain mirrored characters appear semantically wrong on
> > my Windows XP machine. I have no idea if it is a problem of the Unicode
> > BIDI specs or is due to the Windows XP implementation. I describe the
> > problem here, hoping people who know Unicode better can pinpoint the
> > source of it.
> >
> > If I type in: "تار (farsi)", that is the sequence T A R SP ( f a r s i )
> > (capital stands for RTL text), the result is RAT (farsi)
> >
> > However, if I type in "تار (فارسی)" that is the sequence T A R SP ( F A R S I )
> > the result is  ISRAF) RAT)
> >
> > Obviously the parentheses are wrong in the second example. Now, if
> > this is a Unicode spec problem, I think they need to fix this. How does the
> > above text appear on other platforms?
> >
> > _______________________________________________
> > PersianComputing mailing list
> > [EMAIL PROTECTED]
> > http://lists.sharif.edu/mailman/listinfo/persiancomputing
> >
> >
> 
> --behdad
>  behdad.org
>
