Re: Mirroring in Mac OS X (was Mirroring in Unicode)
Dear Behnam,

No, this is another story. The sad news is that there are multiple implementations of Unicode in Mac OS X. WebKit (the engine of Safari) has its own Unicode/Bidi engine. Cocoa has its own Unicode support with no native Bidi, with some ugly Carbon ATSUI patches bolted on and some ICU thrown in to get limited Bidi. Carbon uses an incomplete and degraded implementation of ATSUI, which is itself a downgraded and crippled version of the QuickDraw GX layout engine of the System 7 days. And that is not all. I really hope Apple will start to clean up this extremely ugly mess; otherwise they will be forced out of bidi markets for good. It is amazing how much worse their bidi text engine is compared to 12 years ago.

The problem is that each of these has its own bugs. Sometimes the bugs are a result of the same thing being applied twice because of API layering. This is the case with Safari. In some combinations of style sheet and page tags it tends to mirror a glyph twice, which results in no mirroring at all, and that is wrong. The workaround in such a case is to use a buggy font that does not have a 'prop' table (like a PC font). It then works because the glyph is not mirrored by the normal mechanism, and WebKit's extra mirroring alone produces the correct result.

I really hope someone in an influential position at Apple will listen to me. It really frustrates me to see Apple (once a pioneer in bidi and one of the key founders of Unicode) in its current sad position in bidi support. The problems are deep-rooted and need real effort, and real will at senior management level, to solve.

- Hooman Mehr

___
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing
Re: Mirroring in Unicode
Unless I missed something on the list, that would be me, providing alternatives to Apple's standard keyboards. But they are not a "fix" of existing standards. In fact, they are not standard at all! But you are right. This is a minor issue and can be fixed. I can do it for the Mac community, but I would rather ask Apple to fix it in the original that it ships.

My concern has more to do with the different approaches to handling mirrored characters. The point is that this does not seem to be the way mirrored characters are mapped on MS keyboards, and most web pages are typed on MS keyboards. Am I on the right track?

Behnam
Re: Mirroring in Unicode
Hi,

I checked it and can confirm that Apple's ISIRI 2901 keyboard has a bug in this regard. In ISIRI 2901 the Persian opening parenthesis is on shift-0 and the closing parenthesis is on shift-9, but Apple's implementation has them reversed.

This is a minor issue. The keyboard layout is an XML file that can easily be edited with system administrator privileges. I think someone has already posted information on the list about a fixed and enhanced Persian keyboard for Mac OS X.

- Hooman Mehr
Re: Mirroring in Unicode
On 12-Jun-04, at 8:50 AM, Hooman Mehr wrote:

> On the other hand, I suspect you have font-related issues. Read below...
>
>> This whole thing means that on Mac platform we will see the wrong parenthesis on Persian web-pages forever!
>
> Part of the issue you are experiencing could be related to fonts. Persian/Arabic Apple fonts need a suitable character property table to identify mirrored glyphs and behave correctly. Please compare the behavior of the Geeza Pro standard system font with the fonts you are using. If they differ, it is because of a missing or improperly formed 'prop' table in the font. (http://developer.apple.com/fonts/TTRefMan/RM06/Chap6prop.html) If this is the case, let me know so I can see how to help fix them.

I do all my tests with Geeza Pro, and the ISIRI keyboard does produce the opposite of the intended parenthesis with Geeza Pro. The Apple Persian keyboard produces the intended one because, as I said, it is mapped the opposite way. My other fonts behave similarly, which, I suppose, is good news!

Behnam

P.S. I'm very interested in presenting this discussion to the Apple developer, and I'm working on it.
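For anyone who wants to run the font comparison suggested above without a font editor, here is a minimal sketch that only checks whether a font file lists a 'prop' table in its table directory. It assumes a plain single-font sfnt file (a .ttf, not a .ttc collection or an old resource-fork suitcase), it does not check whether the table is well formed, and the function names are made up for illustration.

```python
import struct


def sfnt_table_tags(data: bytes) -> list[str]:
    """List the table tags found in the table directory of an sfnt font."""
    # Offset table: 4-byte version, then numTables, searchRange,
    # entrySelector and rangeShift as big-endian uint16 values (12 bytes).
    _version, num_tables = struct.unpack_from(">4sH", data, 0)
    tags = []
    for i in range(num_tables):
        # Each directory record is 16 bytes: tag, checksum, offset, length.
        (tag,) = struct.unpack_from(">4s", data, 12 + 16 * i)
        tags.append(tag.decode("latin-1"))
    return tags


def has_prop_table(data: bytes) -> bool:
    """True if the font declares an AAT 'prop' (glyph properties) table."""
    return "prop" in sfnt_table_tags(data)
```

Typical use would be `has_prop_table(open(path, "rb").read())`, where `path` points at whichever font file is being compared against Geeza Pro.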
OT: GNOME/GNU (was Re: Mirroring in Unicode)
> our target system (GNOME/GNU/Linux) GNOME is a GNU project, of course. roozbeh ___ PersianComputing mailing list [EMAIL PROTECTED] http://lists.sharif.edu/mailman/listinfo/persiancomputing
Re: Mirroring in Unicode
On Jun 12, 2004, at 4:14 PM, Behnam wrote:

> I had discussion with an Apple developer on this subject. She insisted that this is the way Unicode wants the mirroring characters to behave and that Apple has no intention to change its implementation of them.

There has been a misunderstanding in your conversation, and in a sense both of you are right. As I develop this topic further you will understand it better. I hope she reads my posts (if she has any influence at Apple) so that something gets fixed on Apple's side as well.

What she needs to realize (along with most other developers) is this: Unicode does not have to dictate the user interface of text input and editing. The user interface of text editing can be vastly improved if we properly design a GUI-optimized model that hides the true underlying Unicode bidi semantics in favor of easier and more user-friendly semantics, while maintaining 100% Unicode compatibility.

On the other hand, I suspect you have font-related issues. Read below...

> This whole thing means that on Mac platform we will see the wrong parenthesis on Persian web-pages forever!

Part of the issue you are experiencing could be related to fonts. Persian/Arabic Apple fonts need a suitable character property table to identify mirrored glyphs and behave correctly. Please compare the behavior of the Geeza Pro standard system font with the fonts you are using. If they differ, it is because of a missing or improperly formed 'prop' table in the font. (http://developer.apple.com/fonts/TTRefMan/RM06/Chap6prop.html) If this is the case, let me know so I can see how to help fix them.

> I guess that along the effort in finding a proper solution for handling of mirroring characters, there has to be an effort to remove this useless mirroring effect in Unicode altogether.

Don't even think about that. At the text stream level, using logical opening and closing parentheses instead of visual left and right parentheses is actually very helpful in keeping the logical text processing model simple and elegant. Also, too many things already depend on it. We need to address this issue in the text input/editing services of the operating system, without touching Unicode. As I mentioned, Unicode is not at fault here. The problem is the current assumption that the Unicode model necessarily applies to the user interface.

- Hooman Mehr
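To make the "fix it in the input layer, not in Unicode" idea concrete, here is a minimal sketch. It is not the implementation Hooman refers to in this thread (that is not published here), only an illustration under assumptions: the table `MIRROR_PAIRS`, the function name, and the `rtl_run` flag are all hypothetical, and deciding whether the cursor sits in a right-to-left run is assumed to be done elsewhere.

```python
# Hypothetical table covering only the ASCII bracket pairs.
MIRROR_PAIRS = {
    "(": ")", ")": "(",
    "[": "]", "]": "[",
    "{": "}", "}": "{",
    "<": ">", ">": "<",
}


def char_for_visual_key(glyph: str, rtl_run: bool) -> str:
    """Return the character to store so that the user sees `glyph`.

    In a right-to-left run a mirrored character is displayed with its
    mirror-image glyph, so the input layer stores the paired character.
    The stored text stays ordinary logical-order Unicode.
    """
    if rtl_run and glyph in MIRROR_PAIRS:
        return MIRROR_PAIRS[glyph]
    return glyph
```

With this mapping the key cap marked "(" always produces a glyph that looks like "(", whatever the direction, while the stored stream still uses the logical opening and closing characters.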
Re: Mirroring in Unicode
On 12-Jun-04, at 5:35 AM, Hooman Mehr wrote:

> - The user-friendly solution involves somewhat moving away from abstract concepts and embracing concrete objects. Let's delve deeper: What do you have on your keyboard that identifies a parenthesis? You have just a physical mark, a concrete object for each one. They do not unambiguously refer to either opening or closing parenthesis. Their meaning depends on the current *mode*. This means that Unicode results in a modal situation without adequate feedback, which I hope everybody agrees is undesirable in most circumstances.

The Apple Macintosh implements mirrored Unicode characters differently from Microsoft. The RTL keyboard layouts of the Macintosh, including the Persian keyboard, actually place the opposite shape of each parenthesis, bracket, etc. on the keyboard in order to produce the intended shape in RTL mode. This is indeed very confusing. When Apple added Persian ISIRI 2901 to its latest OS, it implemented the layout exactly as specified, ISIRI being a standard. As a result, the parentheses on this keyboard produce the opposite of the intended shape in RTL mode.

I had a discussion with an Apple developer on this subject. She insisted that this is the way Unicode wants mirrored characters to behave and that Apple has no intention of changing its implementation of them. This whole thing means that on the Mac platform we will see the wrong parenthesis on Persian web pages forever!

I guess that, along with the effort to find a proper solution for handling mirrored characters, there has to be an effort to remove this useless mirroring effect from Unicode altogether. I know of some Jewish folks who are not too happy about this either!

Behnam
Re: Mirroring in Unicode
Hi Behdad,

I didn't originally notice this part of your post. My apologies.

KDE's example is a bad realization of a good idea, which causes the idea to be discredited. I have an implementation that has been working for years. [1] My implementation looks more like patching a user-hostile assumption in the Unicode design [2], but it works flawlessly. KDE's example does not prove the idea wrong; its implementation is flawed. On the other hand, once you find an implementation that really works, you will never look back. I will share my solution in the Persian GUI spec document, and for better or worse it may become the standard behavior. Now that you have brought this up, I feel obligated to participate actively in its proper implementation, to save it from ending up like KDE's.

Let me just give some hints about what goes wrong when you try to stay true to Unicode when dealing with the text input/edit user interface:

- Most average users have trouble handling and using abstract concepts.

- Unicode talks a lot about logical things and abstractions: opening and closing parentheses are concepts; "(" and ")" are visual, concrete objects. In bi-di text the same closing parenthesis concept may sometimes result in "(" and sometimes in ")", two different objects in the physical world.

- The user-friendly solution involves somewhat moving away from abstract concepts and embracing concrete objects. Let's delve deeper: What do you have on your keyboard that identifies a parenthesis? You have just a physical mark, a concrete object for each one. They do not unambiguously refer to either opening or closing parenthesis. Their meaning depends on the current *mode*. This means that Unicode results in a modal situation without adequate feedback, which I hope everybody agrees is undesirable in most circumstances.

You can see that if we want to make bi-di computing more user friendly, we need to architect a mode-less, WYSIWYG user interface for bidi text input/edit. To achieve that, we have no choice but to go against some Unicode principles and replace some abstract concepts with concrete ones in the context of the user interface. This does not mean that we have to change or violate Unicode. It means that we need to do more work in the text input/edit engine, beyond blindly relying on FriBiDi, to create a clean Unicode text stream in the back-end.

Please note that this does not mean that Unicode is bad or wrong, only that it is not designed to be optimal for interactive text input/edit. Nor does it mean that optimal text input/edit is impossible with Unicode as the back-end text stream/storage model.

- Hooman Mehr

[1] I admit that it is working in a controlled environment and is not stress tested. Also, post-processing can wreck the text stream if it does not observe some rules. That is very hard to enforce on database engines that convert Unicode to some other (usually 8-bit) internal encoding and later convert it back to Unicode.

[2] Unicode uses some good principles to create a logically clean text stream while reducing duplicated characters. The actual implementation does not always stay true to those principles, which makes Unicode as it exists today far uglier than it could have been. The bad news is that some of those principles adversely affect bi-di text in a fundamental way. Unicode has been struggling for years to refine its bi-di handling to the point of today's maturity, and Behdad, you have been a great contributor with your FriBiDi and other efforts. But the fact is, those principles are not a natural fit for bi-di text. We can easily see this: look at the mirrored glyph issue, for example.
Re: Mirroring in Unicode
Hi Behdad,

On Fri, 11 Jun 2004 05:34:42 -0400, Behdad Esfahbod <[EMAIL PROTECTED]> wrote:

> Yes this has been the rule for a few years, but everyone is so scared about auto-inserting marks and later dealing with them, without cluttering the text much. One such implementation is KDE's parentheses fixing idea based on keyboard layout, which is considered quite a failure (read on Arabeyes wiki page for Qt bugs).

I finally figured out that if I insert either an RLE or an LRE character right before each open parenthesis and a PDF character right after each close parenthesis, then all parentheses are matched and their nesting level is preserved as well. Is this guaranteed, or is it just that I could not find a bad example where it breaks? Also, is this the KDE parenthesis-fixing idea you are referring to above?

-- ODC
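The insertion scheme described above can be sketched in a few lines. This is only an illustration of the mechanics. It does not settle whether the approach is guaranteed to work. The function names are made up, the choice between RLE and LRE is left to the caller because the message does not say how to pick one, and leaving an unmatched ")" untouched is an assumption of this sketch.

```python
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def wrap_parentheses(text: str, rtl: bool) -> str:
    """Put an embedding mark before each "(" and a PDF after each ")"."""
    opener = RLE if rtl else LRE
    out = []
    depth = 0  # number of embeddings opened by this function and not yet closed
    for ch in text:
        if ch == "(":
            out.append(opener + ch)
            depth += 1
        elif ch == ")" and depth > 0:
            out.append(ch + PDF)
            depth -= 1
        else:
            # Includes an unmatched ")": no PDF is added, so an embedding
            # opened by someone else is not closed by accident.
            out.append(ch)
    return "".join(out)


def strip_marks(text: str) -> str:
    """Remove the three control characters again, e.g. before searching."""
    return text.translate({ord(c): None for c in (LRE, RLE, PDF)})
```

`strip_marks` is included because the quoted concern is as much about dealing with auto-inserted marks later as about inserting them.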
Re: Mirroring in Unicode
Hi,

This is getting more interesting for me, because it is also one of the issues addressed by the Persian GUI spec document I am writing.

Unfortunately, many people (including Microsoft) abuse Unicode when writing programs. They don't properly understand and observe bi-di semantics, and the choices they make in places where Unicode is either silent or obscure result in poor implementations. So the problem is that Unicode specs and reports are not a substitute for a good understanding of bi-di semantics; they only regularize some aspects of it. I also criticize the Unicode organization for not being thorough enough in pointing out caveats in this regard and in giving the big picture correctly.

I know what I should do to get correct results because I have already discovered it independently. Unicode is just one way of putting some of that knowledge on paper and specifying certain methods to deal with certain issues, without covering all issues. I would never have been able to come up with a correct bi-di implementation solely from Unicode documents. So what Unicode specifies is not wrong, but it is certainly not enough.

Since there is no good documented source that specifies this kind of nuance in the many aspects of handling bi-di text and Arabic script, we came up with the idea of this Persian GUI spec to clarify these issues and provide guidelines to help developers implement correct Persian software (which includes correct bi-di behavior as a subset, along with a lot of other things).

If you are really interested in tackling these issues, contact me off list so that we can collaborate further. I don't see the list as a suitable medium for this discussion, because it will get highly technical and interactive and we will need diagrams to illustrate it. It would confuse many list members who are not seasoned designers/developers.

Just rest assured: the solution is there, clean and conclusive. Developers just need to get it. They can't easily get it (and it may take them years, as it did me) because of the lack of good documentation. The Persian GUI spec is an effort to clarify the solutions to these issues. So I repeat: I need community support and help to produce something really helpful. Please take note that such an effort is in progress and is related to a lot of these things, but it is still in the early stages of being put on paper. Everything is still mostly in my head; help me pull it out onto paper in an understandable way.

- Hooman Mehr
Re: Mirroring in Unicode
On Thu, 10 Jun 2004, Ordak D. Coward wrote:

> Hi Behdad,
>
> I just finished finding the relevant part (Rule L4 of UAX #9) of Unicode specs referring to mirroring. I believe the problem I am complaining about is still a problem and is due to bad Unicode specifications. I do not know how Unicode got mirroring into their standard, and their rationale behind this. However, in my opinion, the correct semantics is that if the input text has matched open and end parenthesis then the visual output should also have matched left and right parenthesis regardless of the paragraph mode. Obviously the Unicode specs break this semantics when the text is "RTLTEXT(RTLTEXT)" and the paragraph is in LTR mode (or vice versa).

I'm sure you agree that matching parentheses is evil in plain text. It breaks all kinds of things, like statelessness, context-freeness, locality, etc. It's plain text after all. And assuming no matching should be considered, that's almost the best you can get. Note that in your example the problem is with your paragraph direction, but if you change the spec to work around it you will definitely create worse problems. In this special case, you need the second parenthesis to behave that way in order to work in the more natural "ltrtext(RTLTEXT)" case.

> While we are talking about the semantics behind BIDI algorithm, I was wondering if BIDI algorithm assigns the same direction to characters regardless of where a line is broken. Which apparently does not! For example, type in "This a very very long line ÙØØØÛ +-* ÛØ ØØØÛ *-+ this is the question!" in a multiline input area. Notice the visual order of *-+ is the same in both occurrences. Now, insert spaces in the beginning until you get both of the *-+ on the second line. Now observe the difference in ordering of the *-+. I again believe this is a design defect of BIDI specifications. Whereas, it only looks at one line at a time, and does not allow (unless I am mistaken) for state information to be propagated across lines when breaking lines. A better design would have allowed (and required) to pass necessary state information from one line to another such that the visual ordering would have stayed the same regardless of where the lines are broken.

No, you are wrong here. Bidi does exactly what you expect. It computes these things called "embedding levels" per paragraph, then reorders the text in each line based on the computed embedding levels. Note that you are probably using MS products that hardly conform to the Unicode standard. If you write down the output you get that you don't expect or like, I can discuss why it's not that bad. I tried your example in gedit, which uses FriBidi 0.10.4 as its bidi engine, and it works fine. The "*-+" always looks the same, no matter where the line breaks.

> Of course, a typical reply could be that I need to insert some control characters to achieve the desired ordering. Then, my rebuttal is that if that is the case, why not make the control characters for such cases mandatory?

Huh? They are mandatory: if you want your specific ordering, you have to insert them.

> Anyway, I have no hope of achieving any positive contribution at Unicode consortium (or other big standard groups like that). So, I am going to turn this into something more fruitful. That is, I like to put the burden of correcting these flaws at the UI. Or:

In fact the Unicode Consortium is very open to suggestions and corrections, but as a bidi expert I tell you, that's almost the best you can get in this logical->visual model.

> "The UI should add control characters at proper places to the user text such that the text renders semantically correct regardless of BIDI inconsistencies"

Yes, this has been the rule for a few years, but everyone is so scared about auto-inserting marks and later dealing with them, without cluttering the text much. One such implementation is KDE's parentheses-fixing idea based on keyboard layout, which is considered quite a failure (read the Arabeyes wiki page on Qt bugs).

> I think satisfying the above requirement is not trivial, but challenging enough to keep a few good minds busy thinking about it.

Sure, but the problem is that there are many, many other easier things that need to be done before we get there. For example, we're right now trying to fix our target system (GNOME/GNU/Linux) to produce and parse Persian digits. I mention this example because it is one of those that is not solved on MS systems either.

If you are interested in the bidi algorithm, I recommend subscribing to the GNU FriBidi mailing list, available from: http://freedesktop.org/Software/FriBidi

Cheers,
--behdad
behdad.org
Re: Mirroring in Unicode
Hi Behdad,

I just finished finding the relevant part (Rule L4 of UAX #9) of the Unicode specs referring to mirroring. I believe the problem I am complaining about is still a problem and is due to bad Unicode specifications. I do not know how mirroring got into the Unicode standard, or the rationale behind it. However, in my opinion, the correct semantics is that if the input text has matched opening and closing parentheses, then the visual output should also have matched left and right parentheses, regardless of the paragraph mode. Obviously the Unicode specs break this semantics when the text is "RTLTEXT(RTLTEXT)" and the paragraph is in LTR mode (or vice versa).

While we are talking about the semantics behind the BIDI algorithm, I was wondering whether the BIDI algorithm assigns the same direction to characters regardless of where a line is broken. Apparently it does not! For example, type "This a very very long line ÙØØØÛ +-* ÛØ ØØØÛ *-+ this is the question!" into a multiline input area. Notice that the visual order of *-+ is the same in both occurrences. Now insert spaces at the beginning until both of the *-+ are on the second line, and observe the difference in the ordering of the *-+.

I again believe this is a design defect of the BIDI specification. It looks at only one line at a time and does not allow (unless I am mistaken) state information to be propagated across lines when breaking lines. A better design would have allowed (and required) the necessary state information to be passed from one line to the next, so that the visual ordering stays the same regardless of where the lines are broken.

Of course, a typical reply could be that I need to insert some control characters to achieve the desired ordering. My rebuttal is: if that is the case, why not make the control characters mandatory for such cases?

Anyway, I have no hope of making any positive contribution at the Unicode Consortium (or other big standards groups like that). So I am going to turn this into something more fruitful. That is, I would like to put the burden of correcting these flaws on the UI. Or:

"The UI should add control characters at proper places to the user text such that the text renders semantically correct regardless of BIDI inconsistencies"

I think satisfying the above requirement is not trivial, but it is challenging enough to keep a few good minds busy thinking about it.
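On the line-breaking question, the reply earlier in this thread is that embedding levels are computed once per paragraph and only the reordering is done per line. A minimal sketch of that reordering step (rule L2 of UAX #9) follows. It takes the levels as given rather than computing them, it skips rule L1 (resetting trailing whitespace) and mirroring (rule L4), and the function name is made up for illustration.

```python
def reorder_line(chars: str, levels: list[int]) -> str:
    """Reorder one line from logical to visual order, given per-character levels.

    From the highest level down to the lowest odd level, reverse every
    contiguous run of characters whose level is at least that level.
    """
    if not chars:
        return chars
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return chars  # purely left-to-right line: nothing to reverse
    out = list(chars)
    lv = list(levels)
    for level in range(max(lv), min(odd_levels) - 1, -1):
        i = 0
        while i < len(out):
            if lv[i] >= level:
                j = i
                while j < len(out) and lv[j] >= level:
                    j += 1
                out[i:j] = reversed(out[i:j])
                lv[i:j] = reversed(lv[i:j])  # levels travel with their characters
                i = j
            else:
                i += 1
    return "".join(out)


# Capitals stand for RTL letters, as elsewhere in this thread.
# The levels are written by hand for an LTR paragraph.
logical = "ab FARSI +-* YA ARABI *-+ cd"
levels = [0] * 3 + [1] * 18 + [0] * 7

print(reorder_line(logical, levels))            # paragraph on one line
print(reorder_line(logical[:13], levels[:13]))  # first line after a break
print(reorder_line(logical[13:], levels[13:]))  # second line after a break
```

Because each line reuses the slice of the paragraph's levels, moving the break only changes which characters share a line. It does not change the direction assigned to any character.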
Re: Mirroring in Unicode
Hi Ordak,

This is not a problem in the Unicode Bidi Algorithm, and not even in Microsoft's implementation of the algorithm. Mirroring seems to be working quite well. The problem is in the higher-level protocols of your system, which simply do not recognize right-to-left paragraphs.

So your "paragraph direction" is left-to-right, and that's why you see it like that. Microsoft systems have no way of auto-detecting paragraph direction. In Notepad you can set the whole document direction to rtl or ltr. In MS Word you can set the direction for individual paragraphs.

GNOME has recently applied a marvelous patch to autodetect paragraph directions in the most sophisticated way, so we're just having fun with our text editors ;-).

behdad
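As an illustration of what auto-detecting a paragraph direction can mean, here is a minimal sketch of the simplest heuristic: take the direction of the first strong character, in the spirit of rules P2 and P3 of UAX #9. It is not the GNOME patch mentioned above, whose details are not given in this thread, and the function name and default are assumptions.

```python
import unicodedata


def first_strong_direction(text: str, default: str = "ltr") -> str:
    """Guess a paragraph direction from its first strong character.

    Bidi classes "L" (left-to-right) and "R"/"AL" (right-to-left) are
    strong. Digits, spaces and punctuation are skipped. The full rule in
    UAX #9 also skips text inside directional isolates, which is ignored here.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return default  # no strong character found
```

For a paragraph that starts with Persian text followed by a parenthesized phrase, this returns "rtl", which is the paragraph direction the example in this thread needed in order to display with matched parentheses.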
Mirroring in Unicode
I noticed that certain mirrored characters appear semantically wrong on my Windows XP machine. I have no idea whether it is a problem in the Unicode BIDI specs or is due to the Windows XP implementation. I describe the problem here, hoping that people who know Unicode better can pinpoint its source.

If I type in "تار (farsi)", that is, the sequence T A R SP ( f a r s i ) (capitals stand for RTL text), the result is

RAT (farsi)

However, if I type in "تار (فارسی)", that is, the sequence T A R SP ( F A R S I ), the result is

ISRAF) RAT)

Obviously the parentheses are wrong in the second example. Now, if this is a Unicode spec problem, I think they need to fix it. How does the above text appear on other platforms?
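For readers who want to see which character properties are involved here, the following short sketch prints the name, bidi class and mirrored flag of a character using Python's standard `unicodedata` module. The helper name `describe` is made up for illustration. The sketch only inspects properties. It does not run the bidi algorithm.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, bool]:
    """Return (name, bidi class, mirrored flag) for a single character."""
    return (
        unicodedata.name(ch),
        unicodedata.bidirectional(ch),
        bool(unicodedata.mirrored(ch)),
    )


for ch in "()\u0641a":
    print(describe(ch))
```

The parentheses are neutral characters with the mirrored property set, so the direction they resolve to, and therefore the glyph that is shown, depends on the surrounding text and on the paragraph direction.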