2015-11-12 19:17 GMT+03:00 Brian Wilson <[email protected]>:
> I posted the following question on the vi/vim stack exchange
> <http://vi.stackexchange.com/questions/5452/set-line-breaks-word-wraps-and-word-searching-for-thai-and-other-non-latin-lang>
> and was told that the vim-dev mailing list would be a more appropriate
> place to ask.
>
> Brian
>
> It is edited here as best I can, with the assumption that the entered text
> is UTF-8.
>
> My goal is a solution for Thai, but rather than a hack, a more general
> solution should be available that would help the more than 1 billion
> speakers of the various Indic languages.
>
> I can set the text width and can manually line break imported paragraphs
> with the following as an example.
>
> set textwidth=72
> gqq
>
> I can also navigate English text files with the standard 'w' 'b' 'e' '*'
> commands, etc.
>
> This works well for English. However, Thai and other Brahmic scripts of
> South and South-east Asia put spaces at the phrasal level. LibreOffice,
> Word, InDesign, TeX, etc. "know" where line breaks should occur. They also
> "know" where individual words are, even though there are no spaces. I can
> navigate by Thai word in these programs. I can even type English, Thai and
> Lao in the Chrome address bar and then use Alt+arrow on my Mac to navigate
> at the word level in all three of these languages. It seems that these
> programs are tapping into work that has already been done at some lower
> level. If Vim could tap into the same work, then someone could edit a
> multi-language document without having to do anything fancy. 'w', 'dw'
> (etc.) would just move happily from one word to the next regardless of the
> language.
>
> Line breaking poses a different challenge, as these languages put spaces
> at the phrasal level, so the presence or absence of a trailing space at
> the end of a line has meaning when breaking and joining lines. As an
> example, the spaces work like an Oxford comma or other punctuation and can
> be the difference between whether or not we had Grandma for breakfast.
> (Let's eat Grandma. vs. Let's eat, Grandma.) One, also, doesn't,
> want, random, spaces, coming, when, they, are, not, needed.
>
> *My question is twofold:*
> 1. How can Vim tap into already available libraries in order to recognize
> words from Indic languages (including and especially Thai) for the purpose
> of navigation and other Vim word-level commands?
> 2. Is it possible to add language awareness for the purpose of line
> breaking, so that Vim does not strip/add spaces when breaking/joining
> lines at words in Thai or other Indic languages?
>
`gq` behaviour is controlled by the &formatexpr and &formatprg option
values, and you may use them if you know a program which serves your
purposes. `w` and other motions can be remapped, and so can `J` (in the last
case you may manually choose between `J` (join with spaces) and `gJ` (join
without inserting spaces, but also without removing them)). So you can get
some minor level of convenience by configuring Vim without patching it. But
this does not work for:
1. Motions inside “nore” mappings.
2. expand('<cword>') and other means of getting word under the cursor (e.g.
:edit <cword>).
3. Behaviour when &linebreak option is set.
4. `\<`/`\>`. Though I am unsure that this should be fixed: I always parsed
these as “place between non-word and word character” and “place between word
and non-word character”, and not “place where word starts” and “place where
word ends”. The documentation describes the second interpretation, but I have
a strong impression (based on wording, actual implementation and the fact
that this is my interpretation) that the author meant the first variant.
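
For reference, the configuration-only approach (the part that works without
patching, before the limitations listed above) might look roughly like the
sketch below. `thai-wrap` is a hypothetical name for an external filter that
reads text on stdin and writes re-wrapped text on stdout; I do not know of a
program with that name, so substitute whatever formatter serves your
purposes.

```vim
" Sketch only: 'thai-wrap' is a hypothetical external filter, not a real
" tool. `gq` pipes the selected lines through it.
set formatprg=thai-wrap
" A non-empty 'formatexpr' is used instead of 'formatprg', so clear it.
setlocal formatexpr=

" Make J join lines without inserting or removing spaces (what gJ does).
nnoremap <buffer> J gJ
xnoremap <buffer> J gJ
```

With this in place the built-in space-inserting join is no longer on `J` in
that buffer, so you may prefer to leave `J` alone and type `gJ` by hand when
you need it.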
>
>
>
> --
> --
> You received this message from the "vim_dev" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
>
> ---
> You received this message because you are subscribed to the Google Groups
> "vim_dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>