Eli Zaretskii <e...@gnu.org> writes:

>> Date: Thu, 24 Feb 2011 14:32:35 +0200
>> From: Eli Osherovich <eli.osherov...@gmail.com>
>>
>> At the moment (using rev. 103371) I can edit Hebrew/English LaTeX
>> documents, however, the way they are displayed in Emacs is not perfect.
>> Please look at the file attached as you can see any English text that
>> appears inside a Hebrew paragraph requires certain decorations around it
>> (e.g., \L{some English text}) these decorations are displayed in an ugly
>> fashion.
>
> Yes, it's a known problem.  The Unicode UAX#9 Bidirectional algorithm
> (which is what Emacs implements for bidirectional display) does not
> produce good results with LaTeX (and with other kinds of markup).
>
>> Is there anything that can be done about it?
>
> Something _should_ be done, for sure.  But for that, Someone™ should
> figure out how this kind of problems could be solved using Emacs
> display features.  Any solution will probably involve reordering only
> parts of text, but a more detailed design suggestion is needed before
> it can be implemented.  People are welcome to try to tackle this,
> because I'm still busy with low-level bidi support of plain text.

I'd like to talk about this problem a little, just to get some
understanding of the problem space.  Please be warned that although I
have read through UAX#9 a few times, and have been following (as best I
can) Eli's bidi work, I am still very much a novice, and am apt to make
improper assumptions or misunderstand how things are supposed to work.

In the examples below, I will use the convention in the UAX#9 document
that a capital letter represents an R type character, and a lower-case
letter represents an L type character.  Formatting codes will be typed
as <RLE>, <PDF>, etc.

So, the example being used was:

    Memory:  HEBREW \foo{english}
    Levels:  11111111222222222221
    Display: {foo{english\ WERBEH

Here the paragraph embedding level is 1 (odd, RtL) since the first
strong character is an R character.  The backslash, braces, and spaces
are N characters.  The N character sequence " \" takes on the current
embedding direction (level 1) based on rule N2.  The open brace gets
level 2 based on rule N1, and the close brace gets level 1 again based
on rule N2.  Note that the close brace appears as its mirrored glyph
due to rule L4.  (Rule N1 states that runs of neutral characters
between strong characters of the same direction take on that
direction.  Rule N2 states that otherwise, they get the embedding
direction.)

Here is another example:

    Memory:  HEBREW \foo{HEBREW}
    Levels:  1111111122211111111
    Display: {WERBEH}foo\ WERBEH

In this case, note that both of the braces are mirrored in the display.

One simple, naive way of handling this for the various TeXs is to
consider all backslashes and brace characters as R characters.  This
can be simulated by surrounding each run of these characters by LRE PDF
pairs.  However, unless TeX ignores these formatting characters
completely, they would have to be removed before the file is processed
by TeX.

Another way of handling this would be to redefine the backslash and
brace characters as R characters, for purposes of the display engine.

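To check my reading of the rules, here is a rough Python sketch (not
Emacs code) that reproduces the two examples above and tries the "treat
backslash and braces as R" idea.  It is a toy model under heavy
assumptions: it only handles P2/P3, N1, N2, I1/I2, L2 and L4, it ignores
weak types and explicit formatting codes entirely, and the names
`resolve_levels`, `display`, `level_string` and `TEX_AS_R` are made up
for illustration.

```python
# Toy model of a few UAX#9 rules; NOT the Emacs implementation.
# Convention from the text: uppercase = R, lowercase = L, all else = neutral (N).

MIRROR = {"{": "}", "}": "{", "(": ")", ")": "(", "[": "]", "]": "["}

# Hypothetical override table: treat TeX's backslash and braces as R characters.
TEX_AS_R = {"\\": "R", "{": "R", "}": "R"}


def char_type(ch, overrides):
    """Return 'R', 'L' or 'N' for one character, honoring the overrides."""
    if ch in overrides:
        return overrides[ch]
    if ch.isalpha():
        return "R" if ch.isupper() else "L"
    return "N"


def resolve_levels(text, overrides=None):
    """Assign levels using only P2/P3 (paragraph level), N1/N2 and I1/I2."""
    overrides = overrides or {}
    types = [char_type(ch, overrides) for ch in text]
    # P2/P3: the first strong character decides the paragraph level.
    first_strong = next((t for t in types if t != "N"), "L")
    base = 1 if first_strong == "R" else 0
    embed = "R" if base else "L"
    i = 0
    while i < len(types):
        if types[i] != "N":
            i += 1
            continue
        j = i
        while j < len(types) and types[j] == "N":
            j += 1
        # Paragraph edges count as the embedding direction (sor/eor).
        before = types[i - 1] if i > 0 else embed
        after = types[j] if j < len(types) else embed
        # N1: same direction on both sides wins; N2: else embedding direction.
        types[i:j] = [before if before == after else embed] * (j - i)
        i = j
    # I1/I2: characters against the embedding direction go one level up.
    return [base if t == embed else base + 1 for t in types]


def level_string(text, overrides=None):
    return "".join(str(lvl) for lvl in resolve_levels(text, overrides))


def display(text, overrides=None):
    """Return the visual order after L4 (mirroring) and L2 (reordering)."""
    levels = resolve_levels(text, overrides)
    if not levels:
        return ""
    # L4: mirror characters that ended up with an odd (RtL) level.
    cells = [(MIRROR.get(ch, ch) if lvl % 2 else ch, lvl)
             for ch, lvl in zip(text, levels)]
    highest = max(levels)
    lowest_odd = min((l for l in levels if l % 2), default=highest + 1)
    # L2: from the highest level down to the lowest odd level,
    # reverse every maximal run of characters at that level or higher.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(cells):
            if cells[i][1] < level:
                i += 1
                continue
            j = i
            while j < len(cells) and cells[j][1] >= level:
                j += 1
            cells[i:j] = cells[i:j][::-1]
            i = j
    return "".join(ch for ch, _ in cells)


if __name__ == "__main__":
    for sample in (r"HEBREW \foo{english}", r"HEBREW \foo{HEBREW}"):
        print(level_string(sample), display(sample), display(sample, TEX_AS_R))
```

In this toy model the override leaves the second example unchanged and
turns the first into `{english}foo\ WERBEH`, so at least both commands
come out with the same shape.  Whether that is actually pleasant to
edit is a separate question.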
I don't know if there is currently a way to redefine a character's bidi
type in elisp.  bidi.c seems to use a character table named
bidi_type_table to hold this information.  To the best of my knowledge,
this table is not exposed at the elisp layer.  Maybe it would be
possible to modify this table in elisp, and possibly make it buffer
local?

Another idea would be to allow a text property to override the
character type.  This feels like a very elegant, Emacs-ish way to do
things, but an uneducated glance at the bidi code makes me feel like it
would be difficult to get information about text properties into this
layer.

Another idea would be to use display strings including the LRE and PDF
characters to replace existing backslashes and braces.  However,
display strings do not affect the bidi algorithm at this point.

I'm really starting to ramble, so I think I will send these musings to
see what Eli and others think.

-- 
Michael Welsh Duggan
(m...@md5i.com)