Thank you for your reply, Tony. I don't know if my English is good enough to make myself clear, but I'll try.
In English, a "word" is, semantically, a sequence of characters (a-zA-Z) and is the smallest meaningful unit. Word segmentation is not needed in English because words are naturally separated by whitespace. The situation is different in CJK languages. It usually takes several CJK characters to form a "word", but that word sits inside an unbroken run of characters and is not easy for a computer to pick out. That is why a word segmentation algorithm is needed to recognize a "word".

As far as I know, Vim simply takes any sequence of characters (other than ,./?><...) as a "word", which is semantically correct for English but not for CJK languages. What I want to know is whether Vim has ever considered adding support for this.

Thanks
Xie

On Jan 21, 9:30 am, Tony Mechelynck <[email protected]> wrote:
> On 20/01/09 17:36, Xie wrote:
> >
> > hi everybody
> >
> > Vim is being used around the world, in many different languages. As
> > the help indicated, a "word" in Vim is defined as "a sequence of
> > letters, digits and underscores ... bla bla bla ...". But that's the
> > word for alphabetic languages. Has Vim considered expanding this
> > concept to more complex multi-byte languages such as Chinese, Japanese
> > or Korean and use some word segmentation algorithm accordingly for
> > "w"/"b" etc ?
> >
> > --
> > Xie
>
> Well, not only "this can be changed" (for single-byte characters) "by
> the 'iskeyword' option", but also (for multibyte characters) Vim "knows"
> that most characters are "word characters", but that some (such as
> U+3000 IDEOGRAPHIC SPACE, U+3001 IDEOGRAPHIC COMMA, U+3002 IDEOGRAPHIC
> FULL STOP etc.) are non-word characters.
>
> What Vim does _not_ do AFAIK is regard every CJK character as a separate
> "word". If you want that, you should use the commands for "character
> under cursor" etc. rather than "word under cursor" etc.
>
> Best regards,
> Tony.
> --
> A lack of leadership is no substitute for inaction.
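
P.S. To make the difference concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not how Vim works and not a proposal for its implementation. The tiny dictionary and the function names are made up for this example, and forward maximum matching is just one simple segmentation technique I picked; a real segmenter would need a large lexicon and a better algorithm.

```python
import re

# Hypothetical toy dictionary, made up for this example only.
TOY_DICT = {"我", "喜欢", "使用", "文本", "编辑器"}


def alphabetic_style_words(text):
    """Treat every maximal run of word characters as one word.

    This roughly mirrors the alphabetic notion of a word: it works for
    English but returns a whole CJK sentence as a single "word".
    """
    return re.findall(r"\w+", text)


def forward_max_match(text, dictionary, max_len=4):
    """Segment text by greedily taking the longest dictionary entry.

    At each position, try candidates from max_len characters down to one.
    A single character is always accepted as a fallback, so the loop
    always advances.
    """
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


sentence = "我喜欢使用文本编辑器"  # "I like to use a text editor"

# One long run of word characters, so only one "word" is found.
print(alphabetic_style_words(sentence))

# Dictionary-based segmentation finds five words.
print(forward_max_match(sentence, TOY_DICT))
```

With this toy dictionary, the first call returns the whole sentence as one item, while the second splits it into 我 / 喜欢 / 使用 / 文本 / 编辑器. That second result is the kind of unit I would expect "w" and "b" to move over.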
You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php
