Hi Bram,

On Sat, Jun 3, 2023 at 3:48 PM Bram Moolenaar wrote:
>
>
> Yegappan wrote:
>
> > I am updating the Vim9 LSP plugin to support various position
> > encodings (utf-8, utf-16 and utf-32).
> > I ran into a problem with positioning the cursor on a multibyte
> > character with composing characters.
> >
> > The LSP plugin uses the Vim function setcursorcharpos() to position
> > the cursor. This function ignores composing characters. The LSP
> > server counts the composing characters separately from
> > the base character. So when using the character index returned by the
> > LSP server to position the cursor, the cursor is placed in an
> > incorrect column.
> >
> > e.g:
> >
> >     void fn(int aVar)
> >     {
> >         printf("aVar = %d\n", aVar);
> >         printf("ŠŠŠŠ = %d\n", aVar);
> >         printf("áb́áb́ = %d\n", aVar);
> >         printf("ą́ą́ą́ą́ = %d\n", aVar);
> >     }
> >
> > I have tried this test with clangd, pyright and gopls language servers
> > and all of them count the composing characters as separate characters.
> >
> > One approach to solve this issue is to add an optional argument to the
> > setcursorcharpos() function that either counts or ignores composing
> > characters. The default is to ignore the composing characters.
> > Another approach is to add a function that computes the
> > character offset ignoring the composing characters from a character
> > offset that includes the composing characters.
> >
> > Any suggestions?
>
> Whether to count composing characters separately or not applies to many
> functions. Adding a flag to each function to specify how composing
> characters are to be handled is going to require a lot of changes. And
> even for setcursorcharpos() I don't see a good way to add this flag.
>
> Assuming we have the text, using a separate function to ignore composing
> characters would be a separate step and a universal solution. I suppose
> it could be something like:
>
>     idx_without = charpos_dropcomposing({text}, {idx_with})
>
> It may not be needed now, but the opposite should be possible:
>
>     idx_with = charpos_addcomposing({text}, {idx_without})
>
> Hopefully we can think of better (shorter) names.
>
> It can possibly already be done with a combination of byteidxcomp() and
> charidx(), since these have a choice of counting composing characters or
> not. That does require two function calls though.
>

Yes. I ended up implementing two helper functions (the ones you suggest
above) to convert between a character index that counts composing
characters and one that does not:

https://github.com/yegappan/lsp/blob/main/autoload/lsp/util.vim#L189
https://github.com/yegappan/lsp/blob/main/autoload/lsp/util.vim#L224

Using these two functions, the Vim9 LSP plugin can now properly support
multibyte characters with composing characters. But as you mentioned
above, each conversion involves calling two functions (byteidxcomp() and
charidx()).

I will create a PR to add the two functions you have described above, so
that the conversion can be done in a single call.

Regards,
Yegappan
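
For reference, the index arithmetic behind the two conversions can be
sketched outside Vim. The Python below is only an illustration, not the
plugin code and not a proposed Vim API: the function names
drop_composing() and add_composing() are hypothetical, indexes are
assumed to be 0-based code point offsets (as with the utf-32 position
encoding), and unicodedata.combining() is assumed to be a reasonable
stand-in for what Vim treats as a composing character, which may differ
in detail.

```python
import unicodedata


def is_composing(ch: str) -> bool:
    # Assumption: a non-zero canonical combining class marks a composing
    # character. Vim uses its own table, which may not match exactly.
    return unicodedata.combining(ch) != 0


def drop_composing(text: str, idx_with: int) -> int:
    """Map an index that counts composing characters separately to the
    index of the base character it belongs to (hypothetical helper)."""
    idx_without = 0
    for pos, ch in enumerate(text):
        # Every non-composing character after the first starts a new
        # base character.
        if pos > 0 and not is_composing(ch):
            idx_without += 1
        if pos == idx_with:
            return idx_without
    # Index past the end of the text: return the number of base characters.
    return idx_without + 1 if text else 0


def add_composing(text: str, idx_without: int) -> int:
    """Map a base-character index to the index of that base character
    when composing characters are counted separately (hypothetical helper)."""
    base = -1
    for pos, ch in enumerate(text):
        if pos == 0 or not is_composing(ch):
            base += 1
            if base == idx_without:
                return pos
    # Index past the end of the text: return the number of code points.
    return len(text)


# "a" and "b" each followed by U+0301 (combining acute accent).
line = "a\u0301b\u0301 = 1"
print(drop_composing(line, 5))  # index of "=" without composing characters
print(add_composing(line, 3))   # index of "=" with composing characters
```

Inside Vim the same mapping is obtained by going through a byte index, as
described above, which is why each direction currently costs two calls.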
