Yegappan wrote:
> > > I am updating the Vim9 LSP plugin to support various position
> > > encodings (utf-8, utf-16 and utf-32).
> > > I ran into a problem with positioning the cursor on a multibyte
> > > character with composing characters.
> > >
> > > The LSP plugin uses the Vim function setcursorcharpos() to position
> > > the cursor. This function ignores composing characters. The LSP
> > > server counts the composing characters separately from the base
> > > character. So when using the character index returned by the LSP
> > > server to position the cursor, the cursor is placed in an incorrect
> > > column.
> > >
> > > e.g.:
> > >
> > > void fn(int aVar)
> > > {
> > >     printf("aVar = %d\n", aVar);
> > >     printf("😊😊😊😊 = %d\n", aVar);
> > >     printf("áb́áb́ = %d\n", aVar);
> > >     printf("ą́ą́ą́ą́ = %d\n", aVar);
> > > }
> > >
> > > I have tried this test with clangd, pyright and gopls language servers
> > > and all of them count the composing characters as separate characters.
> > >
> > > One approach to solve this issue is to add an optional argument to the
> > > setcursorcharpos() function that either counts or ignores composing
> > > characters. The default is to ignore the composing characters. Another
> > > approach is to add a function that computes the character offset
> > > ignoring the composing characters from a character offset that
> > > includes the composing characters.
> > >
> > > Any suggestions?
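
To make the mismatch described above concrete, here is a rough Python
sketch (not Vim or LSP server code). The helper names lsp_length() and
base_char_count() are made up for illustration, and
unicodedata.combining() is only an approximation of what Vim treats as a
composing character.

```python
import unicodedata


def lsp_length(text: str, encoding: str = "utf-16") -> int:
    """Length of text in position units for an LSP position encoding."""
    if encoding == "utf-8":
        return len(text.encode("utf-8"))           # one unit per byte
    if encoding == "utf-16":
        return len(text.encode("utf-16-le")) // 2  # one unit per 16-bit code unit
    if encoding == "utf-32":
        return len(text)                           # one unit per code point
    raise ValueError(f"unknown position encoding: {encoding}")


def base_char_count(text: str) -> int:
    """Count characters while ignoring composing (combining) characters.

    Assumption: a code point with a non-zero canonical combining class is
    a composing character. Vim uses its own table for this.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


sample = "a\u0301b\u0301"  # 'a' + combining acute, 'b' + combining acute
print(lsp_length(sample, "utf-32"), base_char_count(sample))
```

With this sample the utf-32 length is 4 while the count that ignores
composing characters is 2, which is the kind of difference that puts the
cursor in the wrong column.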
> >
> > Whether to count composing characters separately or not applies to many
> > functions. Adding a flag to each function to specify how composing
> > characters are to be handled is going to require a lot of changes. And
> > even for setcursorcharpos() I don't see a good way to add this flag.
> >
> > Assuming we have the text, using a separate function to ignore composing
> > characters would be a separate step and a universal solution. I suppose
> > it could be something like:
> >
> > idx_without = charpos_dropcomposing({text}, {idx_with})
> >
> > It may not be needed now, but the opposite should be possible:
> >
> > idx_with = charpos_addcomposing({text}, {idx_without})
> >
> > Hopefully we can think of better (shorter) names.
> >
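
The two conversions suggested above could behave roughly like the
following Python sketch. This is only an illustration of the idea, not
the code from any Vim patch: drop_composing() and add_composing() are
hypothetical names modelled on the proposal, indexes are zero-based
code point indexes, and unicodedata.combining() stands in for Vim's own
notion of a composing character.

```python
import unicodedata


def is_composing(ch: str) -> bool:
    # Assumption: non-zero canonical combining class means "composing".
    return unicodedata.combining(ch) != 0


def drop_composing(text: str, idx_with: int) -> int:
    """Map an index that counts composing characters separately to the
    index of the base character it belongs to."""
    if not 0 <= idx_with < len(text):
        raise IndexError("idx_with out of range")
    bases = sum(1 for ch in text[:idx_with + 1] if not is_composing(ch))
    return max(bases - 1, 0)


def add_composing(text: str, idx_without: int) -> int:
    """Map an index that ignores composing characters to the index of the
    same base character when composing characters are counted."""
    count = -1
    for i, ch in enumerate(text):
        if not is_composing(ch):
            count += 1
            if count == idx_without:
                return i
    raise IndexError("idx_without out of range")


text = "a\u0301b\u0301"  # 'a' + combining acute, 'b' + combining acute
print(drop_composing(text, 2), add_composing(text, 1))
```

Here index 2 (the 'b' when composing characters are counted) maps to 1,
and index 1 maps back to 2. An index that points at a composing
character is mapped to its base character.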
>
> I have created PR https://github.com/vim/vim/pull/12513 to add these
> two new functions. Should we merge these two functions into a single
> function with an argument to specify whether to count or not count
> combining characters?

Thanks for working on this. My main concern at first is that the user
will be confused by seeing three functions:

    charidx({string}, {idx} [, {countcc} [, {utf16}]])
    charidx_addcc({string}, {idx})
    charidx_dropcc({string}, {idx})

Only when reading the details do we find out that the {idx} of
charidx() is a byte index, while for the other two it is a character
index. Changing the argument name to {byteidx} would help. We may have
to do that for other functions as well, for consistency.

Having the {countcc} argument for charidx() and separate function names
for the other two is confusing, also because "addcc" and "dropcc" can
be seen as an alternative to {countcc} (which is not really incorrect),
while nothing hints that the {idx} argument is used differently.

Alternatively there could be a single function that does have the
{countcc} argument, with a name indicating that {idx} is a character
index:

    charidx_XXX({string}, {idx}, {countcc})

However, is this {countcc} argument really doing the same thing? The
help for charidx() says:

    When {countcc} is omitted or |FALSE|, then composing characters
    are not counted separately, their byte length is added to the
    preceding base character.
    When {countcc} is |TRUE|, then composing characters are
    counted as separate characters.

We can't use exactly the same for charidx_XXX(), since the index is not
in bytes. And when using a character index, we would have to mention
whether composing characters are counted separately. This gets
confusing: an argument {countcc} that actually means something else,
depending on whether you look at the input or the result.

It's probably better to use two separate functions. I hope we find
better names though.

The help for the new functions should be extra clear, since it's easy to
misunderstand. We can discuss that on the PR.
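
To illustrate why {idx} being a byte index matters, here is a rough
Python model of the {countcc} behaviour quoted from the charidx() help.
It is a sketch, not Vim's implementation: char_index() is a made-up
name, the text is assumed to be UTF-8, returning -1 for an out-of-range
index is an assumption, and unicodedata.combining() only approximates
Vim's composing characters.

```python
import unicodedata


def char_index(text: str, byte_idx: int, count_cc: bool = False) -> int:
    """Return the character index of the byte at byte_idx in UTF-8 text.

    With count_cc False, a composing character shares the index of the
    preceding base character. With count_cc True, it gets its own index.
    Returns -1 when byte_idx is out of range (an assumption of this sketch).
    """
    if byte_idx < 0:
        return -1
    pos = 0      # byte offset of the current code point
    index = -1   # character index of the current code point
    for ch in text:
        if count_cc or index < 0 or unicodedata.combining(ch) == 0:
            index += 1
        size = len(ch.encode("utf-8"))
        if byte_idx < pos + size:
            return index
        pos += size
    return -1


text = "a\u0301b\u0301"  # bytes: 'a' at 0, accent at 1-2, 'b' at 3, accent at 4-5
print(char_index(text, 3), char_index(text, 3, count_cc=True))
```

Byte index 3 (the 'b') gives character index 1 without counting
composing characters and 2 when counting them. The same number 3 passed
as a character index would mean something else entirely, which is the
confusion that the argument names should prevent.
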
--
Drink wet cement and get really stoned.
/// Bram Moolenaar -- [email protected] -- http://www.Moolenaar.net \\\
/// \\\
\\\ sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///