On 03/07/09 20:11, Bram Moolenaar wrote:
> Ken Bloom wrote:
[...]
>> They're stored in the file in "logical order", which is the order that
>> the reader processes them when reading. That means, if he has an English
>> document with some embedded Hebrew, then when he encounters the first
>> Hebrew letter, his eyes will skip to the end of the Hebrew phrase (or the
>> end of the same line if it's a multi-line Hebrew phrase), and start
>> working backwards until he hits the English, at which point his eyes will
>> skip again across the Hebrew to the English text that follows the Hebrew.
>> This is "logical order", and it's the order he reads in.
>>
>> It's also the order that a computer would use if it were:
>> * lexicographically comparing mixed-language strings
>> * performing text-to-speech conversion
>> * rewrapping paragraphs
>>
>> It is also the order in which text is typed at the keyboard.
>>
>> See pages 19-20 of the Unicode Standard 5.0 (available online at
>> http://unicode.org/versions/Unicode5.0.0/ch02.pdf)
>>
>> Since display is the only part of the system that doesn't operate in
>> logical order, it's logical to put the conversion into the display
>> routines, rather than putting it into the file itself where it screws up
>> every other operation the computer has to perform on it.
>
> The display is not the only part.  Suppose you move your cursor to the
> start of a word and type "dw".  You expect the word to be deleted.
> Since "start of the word" depends on what direction the word is to be
> read, the editor needs to understand the meaning of the word to be able
> to decide what to do.  And it gets worse: What if some of the characters
> in the word are LTR and some are RTL?  This quickly gets very
> complicated.

With logical order, the start of a word is the letter stored earliest in
memory. Take Allah, stored logically as ALLH (the second alif is usually
not written, or written only as a diacritical mark above the second lam).
If you move your cursor to the leading alif and type "dw", the whole word
should be deleted, from the alif through the heh, even though the alif is
displayed rightmost and the heh leftmost. Memory order is what matters:
with logical storage the first letter is still the first (though maybe
not the leftmost one), which would not be the case if Allah were stored
as HLLA in memory.
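
To make that concrete, here is a minimal Python sketch. It is not Vim's
code: delete_word is a hypothetical, simplified stand-in for "dw" (it only
knows about whitespace-separated words), and "akbar" is just a Latin
placeholder for whatever word follows.

```python
# "Allah" in logical (memory) order: alif, lam, lam, heh.
ALIF, LAM, HEH = "\u0627", "\u0644", "\u0647"
logical_word = ALIF + LAM + LAM + HEH
visual_word = logical_word[::-1]      # HLLA, i.e. stored left to right as displayed


def delete_word(buf: str, cursor: int) -> str:
    """Hypothetical, simplified "dw": delete forward in memory from the
    cursor to the start of the next whitespace-separated word."""
    end = cursor
    # Skip the rest of the current word.
    while end < len(buf) and not buf[end].isspace():
        end += 1
    # Skip the whitespace that follows it.
    while end < len(buf) and buf[end].isspace():
        end += 1
    return buf[:cursor] + buf[end:]


# Logical storage: the alif is at index 0, so "dw" there removes the word.
logical_result = delete_word(logical_word + " akbar", 0)

# Visual storage: the alif is the LAST character of the word (index 3),
# so the same forward deletion removes only the alif and the space.
visual_result = delete_word(visual_word + " akbar", 3)
```

With logical storage the result is just "akbar"; with visual storage the
heh and both lams are left behind, glued onto the next word.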

As for characters needing reordering within a single word, I suppose
that's one of the reasons why Vim doesn't yet support Devanagari,
Gujarati, and the other Indian-subcontinent scripts of that family.

>
> So Vim uses a simple and reliable method: Display the text either as LTR
> or RTL and do the editing assuming all text is to be read that way.
>
> You can open two windows on the same text, one in LTR and one in RTL if
> you want to edit mixed text.
>
> It would be really messy to display the text with mixed directions and
> then have all edits work one way or perhaps fail with an error.  Or
> worse: delete the wrong text.

IIUC it works correctly in Console mode with mlterm (a true-bidi
terminal), though in that case h and l move the cursor in the opposite
direction when the underlying text is RTL: with my Allah example,
repeatedly hitting l moves from A to L to L to H, which is right-to-left
on the screen but still logically first-to-last.
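
A rough Python sketch of that cursor behaviour, under strong assumptions:
the line is a single purely RTL run, every character occupies one cell,
and shaping and combining marks are ignored. visual_column is a
hypothetical helper, not something taken from mlterm or Vim.

```python
word = "\u0627\u0644\u0644\u0647"     # A L L H in logical order


def visual_column(logical_index: int, run_length: int) -> int:
    """Screen column (0 = leftmost) of a character inside a purely RTL run."""
    return run_length - 1 - logical_index


cursor = 0
columns = []
for _ in range(len(word)):
    columns.append(visual_column(cursor, len(word)))
    cursor += 1                       # what "l" does: one character forward in memory

# columns is now [3, 2, 1, 0]: the logical index goes up while the
# on-screen position moves from right to left.
```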

>
> There are actually many more places where it matters: When
> concatenating two files with text, "echo -n" in the shell, etc.
> That's why i18n is so difficult.

When concatenating files, assuming there is a paragraph break between 
them, logical order gives flawless concatenation in all cases. With 
"presentation order", even with a paragraph break you might have to 
reverse each line of one of the files if they didn't have the same 
direction, and then you would have to somehow know the LTR or RTL 
direction of all three files (both inputs and the output) to begin with.
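
As a toy illustration in Python (all names are hypothetical, and the model
is deliberately crude: each presentation-order file is assumed to hold
single-direction lines written for a display that draws characters in the
direction given):

```python
def concat_logical(a: str, b: str) -> str:
    """Logical order: join with a paragraph break, no direction needed."""
    return a.rstrip("\n") + "\n\n" + b


def reverse_lines(text: str) -> str:
    """Reverse the characters of every line, keeping the line order."""
    return "\n".join(line[::-1] for line in text.split("\n"))


def concat_visual(a: str, dir_a: str, b: str, dir_b: str, dir_out: str) -> str:
    """Presentation order: each input must first be brought to the output's
    direction, so all three directions ("ltr" or "rtl") have to be known."""
    def normalize(text: str, direction: str) -> str:
        return text if direction == dir_out else reverse_lines(text)

    return normalize(a, dir_a).rstrip("\n") + "\n\n" + normalize(b, dir_b)


logical = concat_logical("abc\n", "xyz\n")
visual = concat_visual("abc\n", "ltr", "xyz\n", "rtl", "ltr")
```

The logical version never looks at a direction; the presentation-order
version cannot produce anything sensible without being told three of them.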

>
> I'm glad Australians don't write upside-down!
>
>

Oh, they do, only they aren't conscious of it. ;-) Happily the mailboat 
(or plane, or even the email transport) reverses it on the way when 
they're writing to us, or we to them.


Best regards,
Tony.

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_use" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---
