Re: Mirroring in Unicode

Hooman Mehr Sat, 12 Jun 2004 02:38:09 -0700

Hi Behdad,

I didn't originally notice this part of your post. My apologies.

KDE's example is a poor realization of a good idea, and it has caused the idea itself to be discredited. I have an implementation that has been working for years. [1] My implementation looks more like a patch over a user-hostile assumption in Unicode's design [2], but it works flawlessly. KDE's example does not prove the idea wrong; its implementation is flawed.

On the other hand, once you find an implementation that really works, you would never look back. I will share my solution in the Persian GUI spec document and for better or worse it may become the standard behavior. Now that you brought this up, I feel I am obligated to participate actively in its proper implementation to save it from ending up like KDE's.

Let me just give some hints about what goes wrong when you try to stay true to Unicode when dealing with text input/edit user interface:

- Most average users have trouble handling and using abstract concepts.
- Unicode talks a lot about logical things and abstractions: opening and closing parentheses are concepts, while "(" and ")" are concrete visual objects. In bi-di text, the same closing-parenthesis concept may sometimes appear as "(" and sometimes as ")", which are two different objects in the physical world.
- The user-friendly solution involves moving somewhat away from abstract concepts and embracing concrete objects. Let's delve deeper: What do you have on your keyboard that identifies a parenthesis? You have just a physical mark, a concrete object, for each one. These marks do not unambiguously refer to either an opening or a closing parenthesis. Their meaning depends on the current *mode*. This means that Unicode results in a modal situation without adequate feedback, which I hope everybody agrees is undesirable in most circumstances.
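
To make the point about concrete marks versus abstract characters a bit more tangible, here is a minimal sketch in Python. It is only an illustration, not my actual implementation. The helper name code_point_for_shape, the MIRROR table, and the run_is_rtl flag are all hypothetical; in a real editor the direction of the run would come from the bidi engine, and this sketch simply assumes the caller already knows it. The only library call used is the standard unicodedata.mirrored(), which reports whether a character has the Unicode "mirrored" property.

```python
import unicodedata

# Hypothetical table of mirrored partners for a few common brackets.
MIRROR = {"(": ")", ")": "(", "[": "]", "]": "[", "{": "}", "}": "{"}


def code_point_for_shape(shape: str, run_is_rtl: bool) -> str:
    """Return the character to store so that the user sees `shape`.

    `shape` is the concrete mark printed on the key. `run_is_rtl` is an
    assumption of this sketch: the caller says whether the character will
    land in a right-to-left run, where mirrored characters are displayed
    with the opposite glyph.
    """
    if run_is_rtl and unicodedata.mirrored(shape):
        # In an RTL run the renderer flips the glyph, so store the partner.
        return MIRROR.get(shape, shape)
    return shape


# The key marked "(" stores U+0029 in an RTL run and U+0028 in an LTR run.
stored_rtl = code_point_for_shape("(", run_is_rtl=True)
stored_ltr = code_point_for_shape("(", run_is_rtl=False)
```

With something along these lines the user always gets the shape printed on the key, and the stored text stream still holds ordinary Unicode characters.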

You can see that if we want to make bi-di computing more user friendly, we need to architect a mode-less, WYSIWYG user interface for bidi text input/edit. To achieve that, we have no choice but to go against some Unicode principles and replace some abstract concepts with concrete ones in the context of the user interface. This does not mean that we have to change or violate Unicode, but it does mean that we need to do more work on the text input/edit engine instead of blindly relying on FriBiDi to create a clean Unicode text stream in the back-end.

Please note that this does not mean that Unicode is bad or wrong, only that it is not designed to be optimal for interactive text input/edit. Nor does it mean that optimal text input/edit is impossible with Unicode as the back-end text stream/storage model.

- Hooman Mehr

[1] I admit that it is working in a controlled environment and is not stress-tested. Also, post-processing of the text stream can wreck it if the processing does not observe some rules. This is very hard to enforce on database engines that convert Unicode to some other (usually 8-bit) internal encoding and later convert it back to Unicode.

[2] Unicode uses some good principles to create a logically clean text stream while reducing duplicated characters. The actual implementation does not always stay true to those principles, which makes the actual Unicode (as it exists today) far uglier than it could have been. The bad news is that some of those principles adversely affect bi-di text in a fundamental way. Unicode has been struggling for years to refine its bi-di handling to the point of today's maturity, and Behdad, you have been a great contributor with your FriBiDi and other efforts. But the fact is, those principles are not a natural fit for bi-di text. We can easily see this: look at the mirrored glyph issue, for example.

On Jun 12, 2004, at 11:42 AM, Ordak D. Coward wrote:

Hi Behdad,

On Fri, 11 Jun 2004 05:34:42 -0400, Behdad Esfahbod
<[EMAIL PROTECTED]> wrote:


Yes this has been the rule for a few years, but everyone is so
scared about auto-inserting marks and later dealing with them,
without cluttering the text much.  One such implementation is
KDE's parentheses-fixing idea based on keyboard layout, which is
considered quite a failure (read on Arabeyes wiki page for Qt
bugs).


I finally figured out that if I insert either an RLE or an LRE
character right before each open parenthesis and a PDF character right
after each close parenthesis, then all parentheses are matched and
their nesting level is preserved as well. Is this something guaranteed,
or is it just that I could not find a bad example where this breaks?

Also, is this the KDE parenthesis-fixing idea you are referring to above?
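
For reference, the insertion scheme described above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the function names wrap_parens and embedding_marks_balanced are made up for this sketch, the choice between RLE and LRE is left to the caller because the description does not say how to pick one, and the code only inserts the marks and checks that they pair up. It does not run the bidi algorithm, so it does not answer whether the result is guaranteed to display correctly.

```python
# Unicode embedding controls used by the scheme.
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def wrap_parens(text: str, embed: str = RLE) -> str:
    """Insert `embed` before each "(" and PDF after each ")".

    `embed` should be LRE or RLE; which one to use is the caller's choice
    in this sketch.
    """
    out = []
    for ch in text:
        if ch == "(":
            out.append(embed)
            out.append(ch)
        elif ch == ")":
            out.append(ch)
            out.append(PDF)
        else:
            out.append(ch)
    return "".join(out)


def embedding_marks_balanced(text: str) -> bool:
    """Check that every LRE/RLE has a later PDF and no PDF comes first."""
    depth = 0
    for ch in text:
        if ch in (LRE, RLE):
            depth += 1
        elif ch == PDF:
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


wrapped = wrap_parens("a (b (c) d) e")
```

Note that the marks come out balanced only when the parentheses in the input are themselves balanced; an input such as "a) b (" produces a PDF with no embedding to close.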

--
ODC
_______________________________________________
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing

