Yegappan wrote:

> I am updating the Vim9 LSP plugin to support various position
> encodings (utf-8, utf-16 and utf-32).
> I ran into a problem with positioning the cursor on a multibyte
> character with composing characters.
> 
> The LSP plugin uses the Vim function setcursorcharpos() to position
> the cursor.  This function ignores composing characters.  The LSP
> server counts the composing characters separately from the base
> character.  So when using the character index returned by the LSP
> server to position the cursor, the cursor is placed in an incorrect
> column.
> 
> e.g.:
> 
> void fn(int aVar)
> {
>     printf("aVar = %d\n", aVar);
>     printf("😊😊😊😊 = %d\n", aVar);
>     printf("áb́áb́ = %d\n", aVar);
>     printf("ą́ą́ą́ą́ = %d\n", aVar);
> }
> 
> I have tried this test with the clangd, pyright and gopls language
> servers, and all of them count the composing characters as separate
> characters.
> 
> One approach to solve this issue is to add an optional argument to the
> setcursorcharpos() function that either counts or ignores composing
> characters.  The default is to ignore the composing characters.
> Another approach is to add a function that computes the character
> offset ignoring the composing characters from a character offset that
> includes the composing characters.
> 
> Any suggestions?

Whether to count composing characters separately or not applies to many
functions.  Adding a flag to each function to specify how composing
characters are to be handled is going to require a lot of changes.  And
even for setcursorcharpos() I don't see a good way to add this flag.

Assuming we have the text, a separate function that converts the index
to ignore composing characters would be an extra step, but also a
universal solution.  I suppose it could be something like:

        idx_without = charpos_dropcomposing({text}, {idx_with})

It may not be needed now, but the opposite should be possible:

        idx_with = charpos_addcomposing({text}, {idx_without})

Hopefully we can think of better (shorter) names.
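
To make the intended behavior concrete, here is a rough sketch of the two
conversions in Python rather than Vim script.  The function names are the
hypothetical ones from above, not existing Vim functions.  It assumes that
indexes count code points (as with the utf-32 position encoding) and that a
"composing character" is any code point with a non-zero Unicode combining
class; Vim's own notion of composing characters may differ in detail.

```python
import unicodedata


def is_composing(ch: str) -> bool:
    # Assumption: a non-zero canonical combining class marks a composing
    # character.  Vim uses its own table, which may not match exactly.
    return unicodedata.combining(ch) != 0


def charpos_dropcomposing(text: str, idx_with: int) -> int:
    """Convert an index that counts composing characters separately into
    an index that counts only base characters."""
    # Count the base characters that come before idx_with.
    count = sum(1 for ch in text[:idx_with] if not is_composing(ch))
    # An index that lands on a composing character belongs to the base
    # character before it.
    if idx_with < len(text) and is_composing(text[idx_with]) and count > 0:
        count -= 1
    return count


def charpos_addcomposing(text: str, idx_without: int) -> int:
    """Convert an index that counts only base characters into an index
    that counts composing characters separately."""
    seen = 0
    for i, ch in enumerate(text):
        if not is_composing(ch):
            if seen == idx_without:
                return i
            seen += 1
    # Past the last base character: point at the end of the text.
    return len(text)


# "a" and "b", each followed by U+0301 COMBINING ACUTE ACCENT.
line = "a\u0301b\u0301 = 1"
idx_with = line.index("=")            # counts the two accents
idx_without = charpos_dropcomposing(line, idx_with)
```

With this text the "=" is at index 5 when the accents are counted and at
index 3 when they are not, and charpos_addcomposing() maps 3 back to 5.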

It may already be possible with a combination of byteidxcomp() and
charidx(): byteidxcomp() counts composing characters separately, and
charidx() has a flag to choose whether to count them or not.  That does
require two function calls though.
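
The two-step idea can be modelled outside Vim as going through a byte
offset.  This is only a Python sketch of the logic, not the Vim functions
themselves: both helper names are made up, the text is assumed to be UTF-8,
and composing characters are again approximated by a non-zero Unicode
combining class.

```python
import unicodedata


def byte_offset_counting_composing(text: str, idx_with: int) -> int:
    # Step 1 (hypothetical helper): byte offset of the idx_with'th code
    # point, where every composing character counts as its own character.
    return len(text[:idx_with].encode("utf-8"))


def char_index_ignoring_composing(text: str, byte_idx: int) -> int:
    # Step 2 (hypothetical helper): index of the character containing the
    # byte at byte_idx, where a base character and its composing
    # characters count as one.
    offset = 0
    index = -1
    for ch in text:
        if unicodedata.combining(ch) == 0 or index < 0:
            index += 1                 # a new base character starts here
        offset += len(ch.encode("utf-8"))
        if byte_idx < offset:
            return index
    return index + 1                   # byte_idx is at the end of the text


text = "a\u0301b\u0301 = 1"
byte_idx = byte_offset_counting_composing(text, 5)    # the "="
idx_without = char_index_ignoring_composing(text, byte_idx)
```

Here the "=" is at byte offset 7 (each accent takes two bytes in UTF-8), and
the second step turns that into character index 3.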

-- 
Corduroy pillows: They're making headlines!

 /// Bram Moolenaar -- [email protected] -- http://www.Moolenaar.net   \\\
///                                                                      \\\
\\\        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///
