Dominique Pellé <[email protected]> wrote:

> Hi
>
> I found that  :help byteidx()  is rather confusing
> regarding how it treats combining characters.
> It says...
>
>  "Composing characters are counted as a separate character."
>
> I initially interpreted that as combining characters
> are not combined, so counted as separate characters.
> But that is not what byteidx() does with combining chars.
> It actually treats them as a single character.  So the
> help is misleading or badly worded. I only understood
> after experimenting with the attached script which
> shows what byteidx() does with/without combining
> char:
>
> ===
> $ cat byteidx-with-combining-char.vim
> " This script illustrates the behavior of byteidx()
> " with/without combining chars.
>
> " Example of string without using composing chars
> " Code points:    U+002E    U+00E9    U+002E
> " utf8 sequences: (0x2e) (0xc3 0xa9) (0x2e)
> let s:a = '.é.'
>
> " Same string but with composing char for the e-acute.
> " Code points:    U+002E  U+0065 + U+0301   U+002E
> " utf8 sequences: (0x2e) (0x65 + 0xcc 0x81) (0x2e)
> let s:b = '.é.'
>
> echo 'Testing without combining char'
> echo [byteidx(s:a, 0), byteidx(s:a, 1), byteidx(s:a, 2), byteidx(s:a, 3), byteidx(s:a, 4)]
>
> echo 'Testing with combining char'
> echo [byteidx(s:b, 0), byteidx(s:b, 1), byteidx(s:b, 2), byteidx(s:b, 3), byteidx(s:b, 4)]
> ===
>
> : so byteidx-with-combining-char.vim
> Testing without combining char
> [0, 1, 3, 4, -1]
> Testing with combining char
> [0, 1, 4, 5, -1]
>
> Help is also ambiguous about what is meant by
> the "length of string" returned (it's actually a length
> in bytes).
>
> Attached patch makes it clearer I hope.


To be precise, the reason I looked at byteidx() with
combining characters is that I was trying to fix
the issue with the LanguageTool plugin for Vim
shown in the screenshot at:

https://github.com/languagetool-org/languagetool/issues/23#issuecomment-26176511

The wrong part of the second word in the screenshot is
highlighted because LanguageTool counts columns in
characters, with each combining character counted
separately (the same as Java's String.length()),
whereas Vim's byteidx() function counts a base
character together with its combining characters as
one character. So far I have not found a way
to fix the highlighting in the LanguageTool plugin.
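
The mismatch between the two ways of counting can be
reproduced in a few lines. This is only a sketch, not a
fix for the plugin: it assumes 'encoding' is utf-8 and a
Vim build that provides byteidxcomp(), which is not
mentioned anywhere above. The call is guarded so the
script still runs where that function is missing.

```vim
" Sketch only. Assumes 'encoding' is utf-8.
" '.' + U+0065 + composing U+0301 + '.'  (5 bytes in UTF-8)
let s:b = ".e\u0301."

" byteidx() counts 'e' plus its composing char as one character,
" so character index 2 is the final dot.
echo byteidx(s:b, 2)         " 4

" Assumption: byteidxcomp() is available in this Vim.
if exists('*byteidxcomp')
  " byteidxcomp() counts the composing char as its own character,
  " which matches an offset that counts every code point.
  echo byteidxcomp(s:b, 2)   " 2 (start of U+0301)
  echo byteidxcomp(s:b, 3)   " 4 (the final dot)
endif
```

If an external tool reports offsets that count every
code point, a function that counts composing characters
separately is the one whose indexes line up with it.
Whether that is enough for the plugin is untested.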

Dominique
