On 04/05/09 13:07, Ali Gholami Rudi wrote:
> Hi Tony,
[...]
>> @Ali: I've seen at least one of the "Farsi" letters, namely "Farsi yeh",
>> used for the Arabic language in a language textbook ("Teach Yourself
>> Arabic", and I haven't remembered the author's name), so I believe your
>> patch would be useful even for the Arabic language. Of course the use of
>> Farsi yeh can be simulated by using "Arabic" yeh in initial and medial
>> positions, and alef maksura in isolated and final positions, but I'd
>> call that workaround "inelegant".
>
> There is another problem. Farsi fonts might not be able to show them.
Hm, just as Arabic fonts apparently don't always include the "Farsi yeh"
glyphs.
>
> Maybe in theory, the best way of doing it is relying on UnicodeData.txt
> for fetching different forms of a combining letter instead of
> hard-coding it but that is slow and requires including an external file
> so it does not seem to be a good idea.
Yes, I too believe that relying on an external text file for that
kind of thing should be avoided if at all possible.
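As a rough illustration of the trade-off (a sketch only, in Python rather
than Vim's C source): the positional forms of a letter can be derived from
the Unicode character database instead of a hard-coded table. Python's
standard `unicodedata` module ships that data, so no external file is read
here; the function name `presentation_forms` is made up for this example.

```python
import unicodedata

SHAPE_TAGS = ("<isolated>", "<final>", "<initial>", "<medial>")


def presentation_forms(base):
    """Return {shape: character} for the positional forms of one base letter.

    Hypothetical helper: scans the Arabic Presentation Forms-A and -B blocks
    and keeps code points whose compatibility decomposition is exactly
    "<shape> BASE" (ligatures decompose to several letters and are skipped).
    """
    target = "%04X" % ord(base)
    forms = {}
    blocks = list(range(0xFB50, 0xFE00)) + list(range(0xFE70, 0xFF00))
    for cp in blocks:
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) == 2 and parts[0] in SHAPE_TAGS and parts[1] == target:
            forms[parts[0].strip("<>")] = chr(cp)
    return forms


if __name__ == "__main__":
    # Farsi yeh (U+06CC), Arabic yeh (U+064A), alef maksura (U+0649)
    for letter in ("\u06cc", "\u064a", "\u0649"):
        shapes = presentation_forms(letter)
        listing = ", ".join(
            "%s=U+%04X" % (shape, ord(ch)) for shape, ch in sorted(shapes.items())
        )
        print("U+%04X %s: %s" % (ord(letter), unicodedata.name(letter), listing))
```

Scanning two whole blocks for every lookup is the slow part; a real
implementation would presumably build the table once, which is more or less
what hard-coding it amounts to.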
>
> By the way, any ideas for handling NBSP (unicode 0x200C) and ZWJ
> (unicode 0x200D)? Currently Vim shows them as "<200c>" and "<200d>".
> How to hide them?
>
> Regards,
> Ali
>
U2000.pdf from the Unicode site lists them as U+200C ZERO WIDTH
NON-JOINER and U+200D ZERO WIDTH JOINER (the NO-BREAK SPACE is U+00A0).
Since they are zero-width characters but not combining characters, some
artifact is necessary so that in Vim (a text editor, not a word
processor) you can see that they're there and add or remove them if
necessary. Vim shows all zero-width Unicode codepoints as <xxxx>; that's
documented somewhere. I don't think you can hide them with plain-vanilla
Vim, but with Vince Negri's conceal/ownsyntax patch (available as an
unofficial patch on the vim_dev Google site) perhaps you could.
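To check those names and properties without opening the PDF, here is a
small Python sketch. It only imitates the <xxxx> display described above by
testing for the Unicode "Cf" (format) category; that test is my assumption
for the example, not how Vim actually decides, and `reveal` is a made-up
helper name.

```python
import unicodedata


def reveal(text):
    """Replace format characters (category Cf) with a visible <xxxx> marker."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Cf":
            out.append("<%04x>" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    for ch in ("\u200c", "\u200d", "\u00a0"):
        print(
            "U+%04X %s category=%s combining=%d"
            % (ord(ch), unicodedata.name(ch), unicodedata.category(ch),
               unicodedata.combining(ch))
        )
    # Farsi yeh, ZWNJ, Farsi yeh: the joiner control becomes visible
    print(reveal("\u06cc\u200c\u06cc"))
```

The combining class of both joiner controls is 0, which is why they cannot
simply be drawn on top of the preceding character the way combining marks
are.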

When I need to break a word in the middle after an initial or medial
shape (which is rare, but there are two examples on my front page, and I
don't know all the fine points of Arabic Unicode), I may use a tatweel
just before the break. There are other cases where that wouldn't be
practical though; the Arabic dictionary on my table says whether a verb
accepts a "person" or a "thing" as an object by means of the isolated
and/or final shapes of the letter heh standing alone after the verb.

Best regards,
Tony.
--
hundred-and-one symptoms of being an internet addict:
27. You refer to your age as 3.x.