On Mon, Apr 23, 2007 at 12:16:29AM +0800, Abel Cheung wrote:
> On 4/17/07, Rich Felker <[EMAIL PROTECTED]> wrote:
> >What is the output of:
> >echo -e '日本語\b\bhello'
> 
> Wait. Quick question: how much should '\b' backstep when wide characters are
> encountered?
> 
> - a whole wide character?
> - a single byte?
> - a half of wide character?

One byte is obviously nonsense, since the screen contents are not bytes
but characters. Between the other two options there's always a
tradeoff: if you want to move by character positions and \b works in
columns, or vice versa, then you need to know the width (wcwidth) of
the character you're moving over. However...

> Which is considered 'correct'?

Moving by columns is considered correct. Otherwise it would be
impossible to position the cursor to a particular visual location
without already knowing the contents of the screen, which a program
might not even know. On the other hand, if you're moving by
characters, then presumably the program knows what the characters on
the screen are, so it can compute widths.
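
To make that last point concrete, here is a rough sketch of how a
program that wants to step back over characters could work out how many
\b (one column each) to send. It is in Python rather than C, and
char_width() is only a stand-in for wcwidth() built on the Unicode
East Asian Width property; the helper names are made up for the
example, and control characters are not handled.

```python
import unicodedata

def char_width(ch):
    """Approximate wcwidth(): columns occupied by one character.

    Assumption: nonspacing marks take 0 columns, East Asian Wide and
    Fullwidth characters take 2, and everything else takes 1.
    """
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1

def backspaces_for(text, nchars):
    """Number of one-column \\b moves needed to back up over the last
    nchars characters of text."""
    if nchars <= 0:
        return 0
    return sum(char_width(ch) for ch in text[-nchars:])
```

For example, backspaces_for('日本語', 2) gives 4, so stepping back over
the last two characters takes four \b on a column-based terminal.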

Some terminals (Apple's Terminal.app, I believe) allow you to select
the behavior. This has the benefit of allowing programs which are not
aware of wcwidth to function somewhat usably with wide and/or
nonspacing characters, but at the expense of trashing the column
alignment and visual layout of correct programs. It will also likely
cause serious problems if used with GNU screen, which is width-aware.

One slightly problematic issue is what happens if you position the
cursor 'in the middle' of a double width character and then overwrite
the second column of it. In general the results could be anything
bogus, but good terminals will either erase the character or just
leave half of it there.

uuterm does not yet handle this case, and by chance it will end up
looking for a double-width glyph for the newly written character
(which might exist, depending on the font). This behavior of course
should not be relied upon...
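
For illustration, here is a toy model of a single screen row with
column-based \b that takes the "erase the character" option when one
half of a double-width character gets overwritten. It is not how uuterm
or any real terminal is implemented; the WIDE_TAIL marker and the
function names are invented for the sketch, widths are approximated
from the East Asian Width property, nonspacing characters are ignored,
and there is no line wrapping.

```python
import unicodedata

WIDE_TAIL = None  # hypothetical marker: second column of a wide character

def char_width(ch):
    # Assumption: only widths 1 and 2 are modeled here.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def erase_cell(row, c):
    """Blank column c, plus the other half of any wide character there."""
    if row[c] is WIDE_TAIL:
        row[c - 1] = " "          # c was the right half; drop the left half
    elif c + 1 < len(row) and row[c + 1] is WIDE_TAIL:
        row[c + 1] = " "          # c was the left half; drop the right half
    row[c] = " "

def put(row, col, ch):
    """Write ch at column col and return the new cursor column."""
    w = char_width(ch)
    if col + w > len(row):
        return col                # no wrapping in this sketch: drop it
    for c in range(col, col + w):
        erase_cell(row, c)
    row[col] = ch
    if w == 2:
        row[col + 1] = WIDE_TAIL
    return col + w

def write(row, col, text):
    """Feed text to the row; \\b moves one column, not one character."""
    for ch in text:
        if ch == "\b":
            col = max(0, col - 1)
        else:
            col = put(row, col, ch)
    return col

def render(row):
    return "".join(cell for cell in row if cell is not WIDE_TAIL)
```

In this model, writing '日本語\b\bhello' into a blank 10-column row
backs up two columns (one wide character) and leaves '日本hello',
while writing a single 'x' into column 5 (the right half of 語) erases
the whole character and leaves a blank in column 4.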

Rich

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
