Re: [dev] [libgrapheme] Some questions about libgrapheme

atrtarget Fri, 02 Sep 2022 16:05:17 -0700

Hi!

This is a really good suggestion, but I think it may add a lot of
overhead since it would need to go through the entire buffer, and since
moving the cursor is not very frequent (not more than changing your
position or opening a new buffer), I think it would be better to do it
the "lazy" way.

However, thanks for pointing out a solution, I guess it would be really
good for some other situations.

1. Regarding stepping backwards through the graphemes:

As Laslo explained, trying to find the starting point of the previous
grapheme is simply not possible.
In your situation, if scanning from the front of the string is too
inefficient for you, you could try keeping a bitfield in addition to
the string, with one bit for each char of the string.

A 1 in the bitfield means 'this char is the start of a new grapheme',
0 is the opposite.
Every time the string changes, the bitfield is recomputed.
This way, moving the cursor left or right in a text editor is just a
matter of finding the next
or previous set bit in the bitfield, which is extremely cheap.
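
A minimal sketch of the bitfield idea described above, under a few
assumptions. The break function is passed in as a pointer so that a real
grapheme segmenter can be plugged in; next_codepoint_break() is only a
hypothetical stand-in that splits on UTF-8 code point boundaries, which
is not the same as grapheme cluster boundaries. All names here are
illustrative and not part of libgrapheme.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed shape of a "byte length of the next cluster" function. */
typedef size_t (*next_break_fn)(const char *str, size_t len);

/* Hypothetical stand-in: splits on UTF-8 code points, NOT on real
 * grapheme clusters. Replace with a proper segmenter in practice. */
static size_t
next_codepoint_break(const char *str, size_t len)
{
	size_t i = 1;

	if (len == 0)
		return 0;
	while (i < len && ((unsigned char)str[i] & 0xC0) == 0x80)
		i++;
	return i;
}

/* Build a bitfield with one bit per byte of str; a set bit marks the
 * first byte of a cluster. The caller frees the result. */
static unsigned char *
build_breaks(const char *str, size_t len, next_break_fn next)
{
	unsigned char *bits = calloc(len / CHAR_BIT + 1, 1);
	size_t off = 0, step;

	if (bits == NULL)
		return NULL;
	while (off < len) {
		bits[off / CHAR_BIT] |= (unsigned char)(1u << (off % CHAR_BIT));
		step = next(str + off, len - off);
		off += step ? step : 1; /* guard against a zero-length step */
	}
	return bits;
}

static int
is_break(const unsigned char *bits, size_t off)
{
	return (bits[off / CHAR_BIT] >> (off % CHAR_BIT)) & 1;
}

/* Byte offset of the next cluster start after off, or len if none. */
static size_t
cursor_right(const unsigned char *bits, size_t len, size_t off)
{
	assert(off <= len);
	if (off >= len)
		return len;
	for (off++; off < len && !is_break(bits, off); off++)
		;
	return off;
}

/* Byte offset of the closest cluster start before off, or 0. */
static size_t
cursor_left(const unsigned char *bits, size_t len, size_t off)
{
	assert(off <= len);
	if (off == 0)
		return 0;
	for (off--; off > 0 && !is_break(bits, off); off--)
		;
	return off;
}
```

The bitfield has to be rebuilt with build_breaks() whenever the string
changes; after that, each cursor move only scans a few bits. Scanning bit
by bit is the simplest form, and a word-at-a-time scan would be a
possible refinement.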



https://github.com/vim/vim/blob/master/src/libvterm/find-wide-chars.pl
https://github.com/vim/vim/blob/master/src/libvterm/src/fullwidth.inc

I am not 100% sure, but it looks like vim does it the old way. There are
also some comments about it in this file:

https://github.com/vim/vim/blob/master/src/libvterm/src/unicode.c


https://github.com/tmux/tmux/blob/master/utf8.c

tmux seems to go even lazier by using `wcwidth` itself and, by the way,
they seem to have dropped support for systems that don't support it too:

https://github.com/tmux/tmux/pull/3003


Even neovim seems to use the hack:

https://github.com/neovim/neovim/blob/master/src/unicode/EastAsianWidth.txt

I guess the only robust approach is to render the character on the
terminal, and then read back by how much the
cursor was advanced.


This looks like a good idea. The problem is that I'm not sure whether
most terminals will report the actual position in the grid or the number
of graphemes or code points, since it does not seem to be specified
in VT* or in xterm. But as long as this applies to /most/ terminals I
think it's fine, or at least better than wcwidth.
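
A rough sketch of the "render and read back" approach, assuming the
terminal answers the cursor position request (ESC [ 6 n) with a report
of the form ESC [ row ; col R, and that the caller has already switched
the terminal to raw mode so the reply can be read byte by byte without
echo. The function names are hypothetical, and whether the reported
column counts grid cells is exactly the open question above.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Parse a cursor position report of the form ESC [ row ; col R.
 * Returns 0 on success, -1 if buf does not look like a report. */
static int
parse_cpr(const char *buf, int *row, int *col)
{
	char end;

	if (sscanf(buf, "\x1b[%d;%d%c", row, col, &end) != 3 || end != 'R')
		return -1;
	return 0;
}

/* Ask the terminal behind fd for the cursor position.
 * Assumption: fd is readable and writable (e.g. an opened /dev/tty)
 * and is already in raw mode. */
static int
query_cursor(int fd, int *row, int *col)
{
	char buf[32];
	size_t n = 0;

	if (write(fd, "\x1b[6n", 4) != 4)
		return -1;
	while (n < sizeof(buf) - 1 && read(fd, buf + n, 1) == 1) {
		if (buf[n++] == 'R')
			break;
	}
	buf[n] = '\0';
	return parse_cpr(buf, row, col);
}

/* Print s and return by how many columns the cursor advanced.
 * Returns -1 on failure or if the cursor ended up on another row,
 * which is treated here as "the line wrapped". */
static int
measured_width(int fd, const char *s)
{
	int r0, c0, r1, c1;

	if (query_cursor(fd, &r0, &c0) < 0)
		return -1;
	if (write(fd, s, strlen(s)) < 0)
		return -1;
	if (query_cursor(fd, &r1, &c1) < 0)
		return -1;
	return r1 == r0 ? c1 - c0 : -1;
}
```

Each measurement costs a round trip to the terminal, so caching the
result per grapheme would probably be needed in practice. Text that ends
exactly in the last column may also need special handling, since
terminals differ in when they actually move to the next row.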

2. Regarding the avoidance of terminal linewrap:

AFAIK there's no proper way to query the display width of a character.
It definitely depends on the font though.
I guess the only robust approach is to render the character on the
terminal, and then read back by how much the
cursor was advanced.
So perhaps you could try to render the whole line, detect when a line
overflow happens in the terminal based on
the cursor position, and then react accordingly.
It would be interesting to know how (or even if!) other software such
as tmux or vim has solved this issue.



Thank you a lot for helping me!
