Re: khaat e Farsi
Hi,

Behdad Esfahbod <[EMAIL PROTECTED]> wrote:
> Also about the attachment we saw, note that Naskh, Nasta'liq,
> Koofi, etc are all different calligraphic styles of the same
> Arabic script. So even the attachment saying "khatt-e naskh ...
> khatt-e faarsi naam gerefti" is completely non-sense here.

You have probably mixed up the notion of the alphabet with that of the orthographic system. The Arabic alphabet can be adopted by other languages and even dialects. When they adopt the alphabet and its general rules (letter connections and RTL direction), they can adapt those rules to fit the needs of their own language. This adaptation of the rules on top of the alphabet is called "khaat".

In Persian, for instance, we do not distinguish all four forms of /ze/ (ze, zA, zAd, zAl) in pronunciation. We pronounce /zAlem/ with /ze/, not with /zAd/. That is why children in elementary school make a lot of mistakes (in our obligatory dictation), writing words like /tuti/ with /te/ instead of /tA/.

As you are aware, Persian, which is an analytic language, is completely different from Arabic, which is an inflectional one. In Persian you can make a word by adding affixes, which is not possible in Arabic; e.g. the Persian word /nA-tar-AvA-yi/ is equivalent to the Arabic phrase /lA emkAna qAbeliyata tarashoh/.

The Iranians adopted the Arabic alphabet and its general rules and adapted these rules to their totally different language; however, this was possible only because the Arabic alphabet and the Middle Persian alphabet both originate from the same “ArAmi” system. Even when we borrowed nearly 100,000 words from Arabic after the Tazi invasions, we adapted those words to the needs of our own language. E.g. the word “jAme’e”, meaning “university” in Arabic, has changed its pronunciation and its meaning to “society” in Persian. If you still call these borrowed words Arabic, you are probably wrong, because you have not considered the living essence of language.
Language is a living mechanism because it lives and grows with the human mind, and so do scripts and writing systems (for more information, see Noam Chomsky, Language and Mind, 1968).

Conclusion: you can say that the origin of our alphabet is Arabic, but you cannot claim that our writing system is Arabic. Our writing system is Persian, “khaat e farsi”. That is what my teacher Dr. Safavi says as a linguist in his book, and what I also say as a linguist. Just let me know if more linguists are needed to testify :)

However, what linguists have believed and struggled to say has been widely ignored in past years. Dr. Bateni proposed a minor change to our writing system long ago in order to better serve the Persian language; they ignored him and fired him from Tehran University because of political and religious red lines.

Peyman

___
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing
Re: Mirroring in Unicode
Hi Behdad,

I have just found the part of the Unicode specification that deals with mirroring (Rule L4 of UAX #9). I believe the problem I am complaining about is still a problem, and that it is due to a bad Unicode specification. I do not know how mirroring got into the standard, or the rationale behind it. However, in my opinion, the correct semantics are these: if the input text has matched opening and closing parentheses, then the visual output should also have matched left and right parentheses, regardless of the paragraph mode. Obviously the Unicode specification breaks these semantics when the text is "RTLTEXT(RTLTEXT)" and the paragraph is in LTR mode (or vice versa).

While we are talking about the semantics behind the BIDI algorithm, I was wondering whether the algorithm assigns the same direction to characters regardless of where a line is broken. Apparently it does not! For example, type "This a very very long line فارسی +-* یا عربی *-+ this is the question!" into a multiline input area. Notice that the visual order of *-+ is the same in both occurrences. Now insert spaces at the beginning until both occurrences of *-+ are on the second line, and observe the difference in their ordering.

Again, I believe this is a design defect of the BIDI specification: it looks at only one line at a time and does not allow (unless I am mistaken) state information to be propagated across lines when they are broken. A better design would have allowed (and required) the necessary state information to be passed from one line to the next, so that the visual ordering stays the same regardless of where the lines are broken.

Of course, a typical reply could be that I need to insert some control characters to achieve the desired ordering. My rebuttal is: if that is the case, why not make the control characters mandatory in such cases?

Anyway, I have no hope of making any positive contribution at the Unicode Consortium (or other big standards groups like it).
So I am going to turn this into something more fruitful: I would like to put the burden of correcting these flaws on the UI. That is:

"The UI should add control characters at the proper places in the user's text so that the text renders semantically correctly regardless of BIDI inconsistencies."

I think satisfying the above requirement is not trivial, but it is challenging enough to keep a few good minds busy thinking about it.

On Thu, 10 Jun 2004 21:47:03 -0400, Behdad Esfahbod <[EMAIL PROTECTED]> wrote:
> Hi Ordak,
>
> This is not a problem in the Unicode Bidi Algorithm, not even in
> Microsoft's implementation of the algorithm. And mirroring seems
> to be working quite well. The problem is in the higher level
> protocols of your system, which simply do not recognize
> right-to-left paragraphs.
>
> So your "paragraph direction" is left-to-right, and that's why
> you see it like that. Microsoft systems have no way of
> auto-detecting paragraph directions. In Notepad you can set the
> whole document direction to rtl or ltr. In MS Word you can set
> direction for individual paragraphs.
>
> GNOME has recently applied a marvelous patch to autodetect
> paragraph directions in the most sophisticated way, so we're just
> having fun with our text editors ;-).
>
> behdad
>
> On Thu, 10 Jun 2004, Ordak D. Coward wrote:
>
> > I noticed that certain mirrored characters appear semantically wrong on
> > my Windows XP machine. I have no idea if it is a problem of Unicode
> > BIDI specs or is due to Windows XP implementation. I describe the
> > problem here, hoping people who know Unicode better pinpoint the
> > source of it.
> >
> > If I type in: "تار (farsi)", that is the sequence T A R SP ( f a r s i )
> > (capital stands for RTL text), the result is RAT (farsi)
> >
> > However, if I type in "تار (فارسی)" that is the sequence T A R SP ( F A R
> > S I )
> > the result is ISRAF) RAT)
> >
> > Obviously the parentheses are wrong in the second example.
> > Now, if this is a Unicode spec problem, I think they need to fix this.
> > How does the above text appear on other platforms?
>
> --behdad
> behdad.org
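For the simplest case, the requirement stated in this message (the UI adds control characters so that parentheses stay matched) might be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name pin_edge_neutrals is made up here, it deals only with neutral characters at the very start or end of a paragraph, and it ignores digits, explicit embeddings and the line-breaking case. It relies on LRM (U+200E) and RLM (U+200F) being strong characters, so that a neutral sitting between a letter and a mark of the same direction resolves to that direction.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, a strong L character
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, a strong R character


def strong_direction(ch):
    """Return "L" or "R" for strong characters, None for everything else."""
    cls = unicodedata.bidirectional(ch)
    if cls == "L":
        return "L"
    if cls in ("R", "AL"):
        return "R"
    return None


def pin_edge_neutrals(text):
    """Hypothetical helper: fence non-strong characters at the edges of a
    paragraph with a directional mark matching the nearest strong character,
    so they no longer depend on the paragraph direction."""
    dirs = [strong_direction(c) for c in text]
    strong = [d for d in dirs if d is not None]
    if not strong:
        return text  # nothing to align the neutrals with
    mark = {"L": LRM, "R": RLM}
    if dirs[0] is None:
        text = mark[strong[0]] + text
    if dirs[-1] is None:
        text = text + mark[strong[-1]]
    return text


sample = "\u062a\u0627\u0631 (\u0641\u0627\u0631\u0633\u06cc)"  # T A R SP ( F A R S I )
fixed = pin_edge_neutrals(sample)
print([hex(ord(c)) for c in fixed[-2:]])
```

With the mark appended, the closing parenthesis lies between two strong RTL characters, so it should be resolved as RTL and mirrored even in an LTR paragraph. Whether inserting marks silently is acceptable for the user is a separate design question.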
Re: Mirroring in Unicode
Hi Ordak,

This is not a problem in the Unicode Bidi Algorithm, not even in Microsoft's implementation of the algorithm. And mirroring seems to be working quite well. The problem is in the higher level protocols of your system, which simply do not recognize right-to-left paragraphs.

So your "paragraph direction" is left-to-right, and that's why you see it like that. Microsoft systems have no way of auto-detecting paragraph directions. In Notepad you can set the whole document direction to rtl or ltr. In MS Word you can set direction for individual paragraphs.

GNOME has recently applied a marvelous patch to autodetect paragraph directions in the most sophisticated way, so we're just having fun with our text editors ;-).

behdad

On Thu, 10 Jun 2004, Ordak D. Coward wrote:

> I noticed that certain mirrored characters appear semantically wrong on
> my Windows XP machine. I have no idea if it is a problem of Unicode
> BIDI specs or is due to Windows XP implementation. I describe the
> problem here, hoping people who know Unicode better pinpoint the
> source of it.
>
> If I type in: "تار (farsi)", that is the sequence T A R SP ( f a r s i )
> (capital stands for RTL text), the result is RAT (farsi)
>
> However, if I type in "تار (فارسی)" that is the sequence T A R SP ( F A R S
> I )
> the result is ISRAF) RAT)
>
> Obviously the parentheses are wrong in the second example. Now, if
> this is a Unicode spec problem, I think they need to fix this. How does
> the above text appear on other platforms?

--behdad
behdad.org
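To illustrate what auto-detecting the paragraph direction can mean, here is a minimal Python sketch of the first-strong-character heuristic (rules P2 and P3 of UAX #9). It is not the GNOME patch mentioned in this message, whose logic is not described in the thread; paragraph_direction is a hypothetical helper name and the "ltr" fallback is an assumption.

```python
import unicodedata


def paragraph_direction(text, default="ltr"):
    """Guess the paragraph direction from the first strong character.

    L means left-to-right; R and AL (Arabic letter) mean right-to-left.
    Digits, spaces and punctuation are skipped because they are not strong.
    """
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "ltr"
        if cls in ("R", "AL"):
            return "rtl"
    return default  # no strong character found; fall back to the assumed default


print(paragraph_direction("\u062a\u0627\u0631 (\u0641\u0627\u0631\u0633\u06cc)"))
print(paragraph_direction("(123) farsi"))
```

With a right-to-left paragraph direction, the trailing parenthesis in the second example of the quoted message is resolved as RTL and mirrored like the opening one.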
Mirroring in Unicode
I noticed that certain mirrored characters appear semantically wrong on my Windows XP machine. I have no idea whether it is a problem of the Unicode BIDI specs or is due to the Windows XP implementation. I describe the problem here, hoping that people who know Unicode better can pinpoint the source of it.

If I type in "تار (farsi)", that is the sequence T A R SP ( f a r s i ) (capital stands for RTL text), the result is RAT (farsi)

However, if I type in "تار (فارسی)", that is the sequence T A R SP ( F A R S I ), the result is ISRAF) RAT)

Obviously the parentheses are wrong in the second example. Now, if this is a Unicode spec problem, I think they need to fix this. How does the above text appear on other platforms?
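Both results reported above can be reproduced with a toy model of the bidi algorithm. The Python sketch below follows the convention of this message (uppercase stands for RTL letters, lowercase for LTR letters, everything else is neutral) and is only an approximation: it models neutral resolution (N1/N2), implicit levels (I1/I2), reordering (L2) and mirroring (L4), and leaves out digits, explicit embedding controls and the paired-bracket rule that later versions of UAX #9 added. All function names are made up for this illustration.

```python
MIRROR = {"(": ")", ")": "(", "[": "]", "]": "[", "<": ">", ">": "<"}


def bidi_class(ch):
    """Toy classes: uppercase = strong RTL, lowercase = strong LTR, rest neutral."""
    if ch.isalpha():
        return "R" if ch.isupper() else "L"
    return "N"


def resolve_levels(text, base_rtl):
    base_dir = "R" if base_rtl else "L"
    classes = [bidi_class(c) for c in text]
    resolved = classes[:]
    i, n = 0, len(text)
    while i < n:
        if classes[i] != "N":
            i += 1
            continue
        j = i
        while j < n and classes[j] == "N":
            j += 1
        # N1/N2: a neutral run takes the direction of its neighbours if they
        # agree, otherwise the paragraph direction (also used at both edges).
        before = classes[i - 1] if i > 0 else base_dir
        after = classes[j] if j < n else base_dir
        resolved[i:j] = [before if before == after else base_dir] * (j - i)
        i = j
    if base_rtl:
        return [1 if d == "R" else 2 for d in resolved]  # I2
    return [1 if d == "R" else 0 for d in resolved]      # I1


def visual(text, base_rtl=False):
    """Return the characters in left-to-right display order."""
    levels = resolve_levels(text, base_rtl)
    # L4: mirror characters whose resolved level is odd (RTL).
    items = [(MIRROR.get(c, c) if lv % 2 else c, lv) for c, lv in zip(text, levels)]
    # L2: from the highest level down to 1, reverse runs at that level or higher.
    for level in range(max(levels, default=0), 0, -1):
        i = 0
        while i < len(items):
            if items[i][1] >= level:
                j = i
                while j < len(items) and items[j][1] >= level:
                    j += 1
                items[i:j] = reversed(items[i:j])
                i = j
            else:
                i += 1
    return "".join(c for c, _ in items)


print(visual("TAR (farsi)"))                 # RAT (farsi)
print(visual("TAR (FARSI)"))                 # ISRAF) RAT)
print(visual("TAR (FARSI)", base_rtl=True))  # (ISRAF) RAT
```

In this model the closing parenthesis of the second example sits between an RTL letter and the end of an LTR paragraph, so it takes the paragraph direction and is neither mirrored nor moved. With an RTL paragraph both parentheses are resolved as RTL and the output is matched.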
Re: Locale requirement of Persian in Iran, first public draft
I just got this calendar from Iran in the mail: http://students.washington.edu/irina/cal.jpg I guess this orientation is more popular than I thought. I find it too hard to use since I'm used to the more common arrangement (i.e. across the top and then top to bottom) but obviously people do like and prefer this other way. Good thing you included both in the draft! -Connie
Designing a bidi text input UI Was: UI problems in editing BiDi texts.
Hooman,

I agree. However, I need to clarify a few things. I like a design process that is based on principles. Here are the steps of a design process that I like:

1) Lay down a few underlying design principles.
2) Try to come up with designs that follow the principles (even if not too closely).
3) Implement the designs.
4) Test the implementations.

Of course, many important issues are missing from the above picture, mainly the fabric of the design group, or in other words how the design group is formed and how it works.

For this specific design, here are the main principles in order of importance.

- ease of use
  The design should consider a wide variety of users, e.g. users who have been introduced to computers and to bidi text input at different levels. At one end we may have people who are using a computer for the first time and cannot read Latin characters; then people who are so used to monodirectional (as opposed to bidi) text input that they would like to see exactly the same semantics at work for bidi text input; and at the other end people who have no problem with the current bidi input system(s). The input semantics should hence try to accommodate all these users. The semantics that work for me as a user (versus a designer) are:
  o Keystrokes should have a visual effect on screen.
    - I should be able to see what I have typed so far.
    - I would like to see all control characters when typing as well.
  o The cursor, or visual cues in general, should tell me what will happen when I press certain keys on the keyboard. (A design may implement two cursors when the visual position of the next character is not unique.)
  o Arrow keys: I would like the arrow keys to move in visual order, even if that means I cannot place the cursor between some characters.
  o Pointing device: it should allow me to perform the functions I am currently used to, such as selection, copy and paste, and moving the text insertion position.
  o Selection: I would like selections to look continuous on screen.
    If a selection would look fragmented, I would like to be forced into selecting a bigger part of the text that is not fragmented.

- implementability
  The design should consider the limitations of current platforms. For example, the design should assume that the input devices are a pointing device and a standard keyboard, and that the output device is a small area of the screen. The designer should also consider whether the design can be implemented inside current platforms.

- completeness
  The design should allow the user to input all possible bidi texts.

On Thu, 10 Jun 2004 10:53:43 +0430, Hooman Mehr <[EMAIL PROTECTED]> wrote:
> Also, note that this issue is one of the single most important issues
> we need to solve in order to make using computers as easy for bi-di
> users as it is for Roman (or Latin) Script users.
>
> There is a lot of depth to this issue, don't try to come up with a
> quick idea and immediately think that you have solved it. It takes an
> expert in human computer interaction design. Someone in the same class
> as experts like Jef Raskin, Bruce "Tog" Tognazzini, Donald A. Norman
> or Jakob Nielsen. Even then, we need extensive prototyping and user
> testing to refine the solution and select the best alternatives.
>
> Alright, I know, we don't live in an ideal world (or Iran) and we
> really can't expect to go after this issue in a really systematic way,
> but let's try to deal with it the best we can.
>
> - Hooman Mehr
>
> [1] If you have missed that post, look for "Persian GUI Design
> Specifications & Guidelines" in the list archives.
>
> On Jun 9, 2004, at 3:01 AM, Ordak D. Coward wrote:
>
> > Please ignore this until I can successfully prepare a long e-mail with
> > gmail :(
> >
> > On Tue, 8 Jun 2004 17:08:53 -0400, Ordak D. Coward <[EMAIL PROTECTED]>
> > wrote:
> >> Following up the old thread, here is my attempt to understand the
> >> problem. We may then agree on a desired behavior, and then on an
> >> implementation.
> >>
> >> The problems appear when typing text in a BiDi-enabled editor. There
> >> seem to be three categories of concern.
> >>
> >> 1) When typing a bilingual text, the cursor jumps unexpectedly. An
> >> example is when I type "HERE IS SOME RTL TEXT" (where UPPERCASE
> >> stands for RTL characters) in Notepad or any input line: the cursor
> >> (denoted by |) and text appear as follows:
> >> |
> >> |EH
> >>
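The selection principle from my list above (a selection that would look fragmented on screen is grown until it looks continuous) can be made concrete with a small Python sketch. It assumes the embedding levels of one line are already known, e.g. from a bidi engine; visual_order and grow_selection are hypothetical names, and the reordering step is a simplified form of rule L2 of UAX #9.

```python
def visual_order(levels):
    """Return logical indices in left-to-right display order (simplified L2)."""
    order = list(range(len(levels)))
    for level in range(max(levels, default=0), 0, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = order[i:j][::-1]  # reverse the run in place
                i = j
            else:
                i += 1
    return order


def grow_selection(levels, start, end):
    """Expand the logical selection [start, end] (inclusive) until the
    selected characters occupy one unbroken span on screen."""
    order = visual_order(levels)
    position = {logical: vis for vis, logical in enumerate(order)}
    while True:
        vis = [position[i] for i in range(start, end + 1)]
        covered = order[min(vis):max(vis) + 1]  # everything inside the visual span
        new_start, new_end = min(covered), max(covered)
        if (new_start, new_end) == (start, end):
            return start, end
        start, end = new_start, new_end


# "abc DEF ghi" in an LTR paragraph: only D, E, F are at level 1.
levels = [0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
print(visual_order(levels))          # [0, 1, 2, 3, 6, 5, 4, 7, 8, 9, 10]
print(grow_selection(levels, 2, 4))  # (2, 6)
```

In the example, selecting "c", the space and "D" in logical order would highlight two separate pieces on screen, so the selection is grown to cover the whole of "c DEF". This is only one possible reading of the principle; a design could equally keep the selection visual and accept a fragmented logical range.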
Re: khaat e Farsi
Thanks a lot, Hooman, for the clarification.

Also about the attachment we saw, note that Naskh, Nasta'liq, Koofi, etc. are all different calligraphic styles of the same Arabic script. So even the attachment saying "khatt-e naskh ... khatt-e faarsi naam gerefti" is complete nonsense here.

There are much more important things that define a script than the number of letters, calligraphic styles, pronunciations, etc. The fact that you can read what is written in those 20 countries without any training, and that there are situations in which you simply cannot tell them apart, is what matters IMO.

And note that it is quite natural that most of us have never heard of such a grouping before, but all linguists will tell you this is the Arabic (or Perso-Arabic) script.

behdad

--behdad
behdad.org
Re: khaat e Farsi
The book can very easily be biased. The sentence "... dastkhosh-e taghiraati besiaar jaaleb shod, ke neshaangar-e aagaahi-e iraaniaan az daanesh-e zabaansheniaasi ast." is far from justified.

Don't know why, but it reminds me of the Persian vs. Farsi problem...

On Wed, 9 Jun 2004, Peyman wrote:

> The attached .jpg is a text from the book "pishineye zabane
> farsi" written by Dr. Safavi.
>
> Peyman

--behdad
behdad.org
Re: UI problems in editing BiDi texts.
While we are on this topic:

As I already mentioned, I am working on a Persian GUI spec document [1]. A sizable portion of this document deals with this topic. If somebody wants to quickly start working on a new and improved implementation, please consult me so that I can share my experience and my early draft on the issue.

Also, note that this issue is one of the single most important issues we need to solve in order to make using computers as easy for bi-di users as it is for Roman (or Latin) Script users.

There is a lot of depth to this issue; don't try to come up with a quick idea and immediately think that you have solved it. It takes an expert in human-computer interaction design, someone in the same class as experts like Jef Raskin, Bruce "Tog" Tognazzini, Donald A. Norman or Jakob Nielsen. Even then, we need extensive prototyping and user testing to refine the solution and select the best alternatives.

Alright, I know, we don't live in an ideal world (or Iran) and we really can't expect to go after this issue in a really systematic way, but let's try to deal with it the best we can.

- Hooman Mehr

[1] If you have missed that post, look for "Persian GUI Design Specifications & Guidelines" in the list archives.

On Jun 9, 2004, at 3:01 AM, Ordak D. Coward wrote:

> Please ignore this until I can successfully prepare a long e-mail with
> gmail :(
>
> On Tue, 8 Jun 2004 17:08:53 -0400, Ordak D. Coward <[EMAIL PROTECTED]>
> wrote:
>> Following up the old thread, here is my attempt to understand the
>> problem. We may then agree on a desired behavior, and then on an
>> implementation.
>>
>> The problems appear when typing text in a BiDi-enabled editor. There
>> seem to be three categories of concern.
>>
>> 1) When typing a bilingual text, the cursor jumps unexpectedly.
>> An example is when I type "HERE IS SOME RTL TEXT" (where UPPERCASE
>> stands for RTL characters) in Notepad or any input line: the cursor
>> (denoted by |) and text appear as follows:
>> |
>> |EH