Re: multibyte characters in the Info reader

Eli Zaretskii Thu, 22 Jan 2026 09:10:47 -0800

> From: Gavin Smith <[email protected]>
> Date: Thu, 22 Jan 2026 16:20:11 +0000
> 
> On Thu, Jan 22, 2026 at 01:16:36PM +0100, Patrice Dumas wrote:
> > It seems to me that the display width of multibyte characters, typically
> > ideograms, is not taken into account.  I do not know if it is on
> > purpose, nor if it is easy to do with the current code, but in case it
> > could be useful, here is how it is done in texi2any with the help of
> > libunistring, for a string already UTF-8 encoded:
> > 
> >  uint8_t *u8_text = (uint8_t *) text;
> >  int width = u8_strwidth (u8_text, "UTF-8");
> > 
> >  u8_width could also be used after the number of bytes for a UTF-8
> >  character has been collected.
> 
> Yes, as I said, I don't think it is worth it:
> 
> > > Reading the UTF-8 sequence, obtaining the codepoint and calling wcwidth
> > > seems to me to be an unnecessary complication for a marginal use case.
> 
> I don't see why we should add a libunistring dependency and more code to deal
> with this case.
> 
> It is only a problem on particular terminals on MS-Windows (which is an OS
> family of secondary importance for the GNU Project).


If we don't care about moving the cursor across wide characters when
this option is in use, then fine.

> The fix I posted seems to work well enough and allows users to read UTF-8
> manuals under such conditions.

Yes, the display itself is okay, only cursor motion might bump into
problems.
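
For reference, here is a minimal sketch of what "reading the UTF-8
sequence, obtaining the codepoint and calling wcwidth" would amount to.
It is not code from the Info reader.  The names decode_utf8, toy_width
and utf8_columns are hypothetical, and toy_width is a deliberately tiny
stand-in for wcwidth or libunistring's u8_width: it knows only a few
double-width ranges and ignores combining and control characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence at S (N bytes available, N > 0).  Store the
   code point in *CP and return the number of bytes consumed.  Malformed
   or truncated input consumes one byte and yields U+FFFD.  Overlong
   forms are not rejected; this is only a sketch.  */
static size_t
decode_utf8 (const unsigned char *s, size_t n, uint32_t *cp)
{
  size_t len, i;
  uint32_t c;

  assert (s != NULL && n > 0);
  if (s[0] < 0x80) { *cp = s[0]; return 1; }
  else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
  else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
  else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
  else { *cp = 0xFFFD; return 1; }

  if (len > n) { *cp = 0xFFFD; return 1; }
  for (i = 1; i < len; i++)
    {
      if ((s[i] & 0xC0) != 0x80) { *cp = 0xFFFD; return 1; }
      c = (c << 6) | (s[i] & 0x3F);
    }
  *cp = c;
  return len;
}

/* Hypothetical stand-in for wcwidth: 2 columns for a few well-known
   double-width ranges, 1 column for everything else.  */
static int
toy_width (uint32_t cp)
{
  if ((cp >= 0x4E00 && cp <= 0x9FFF)       /* CJK Unified Ideographs */
      || (cp >= 0xAC00 && cp <= 0xD7A3)    /* Hangul syllables */
      || (cp >= 0xFF01 && cp <= 0xFF60))   /* fullwidth forms */
    return 2;
  return 1;
}

/* Number of screen columns needed for the NBYTES bytes at TEXT.  */
static int
utf8_columns (const char *text, size_t nbytes)
{
  const unsigned char *p = (const unsigned char *) text;
  size_t i = 0;
  int cols = 0;

  assert (text != NULL);
  while (i < nbytes)
    {
      uint32_t cp;
      i += decode_utf8 (p + i, nbytes - i, &cp);
      cols += toy_width (cp);
    }
  return cols;
}
```

Cursor motion would advance by toy_width columns per decoded character
instead of one column per character.  In real code the decoder and the
width table would presumably be replaced by mbrtowc plus wcwidth in a
UTF-8 locale, or by u8_width as Patrice suggested.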
