On Mon, Apr 23, 2007 at 12:16:29AM +0800, Abel Cheung wrote:
> On 4/17/07, Rich Felker <[EMAIL PROTECTED]> wrote:
> >What is the output of:
> >echo -e '日本語\b\bhello'
>
> Wait. Quick question: how much should '\b' backstep when wide characters are
> encountered?
>
> - a whole wide character?
> - a single byte?
> - a half of wide character?

One byte is obviously nonsense, since the screen contents are not bytes but characters. Between the other two options there's always a tradeoff: if you want to move by character positions and \b works in columns, or vice versa, then you need to know the width (wcwidth) of each character you're moving over. However..

> Which is considered 'correct'?

Columns is considered the correct behavior. Otherwise it would be impossible to position the cursor at a particular visual location without already knowing the contents of the screen, which a program might not even know. On the other hand, if you're moving by characters, then presumably the program knows what the characters on the screen are, so it can compute their widths.

Some terminals (Apple's Terminal.app, I believe) allow you to select the behavior. Moving by whole characters has the benefit of allowing programs which are not aware of wcwidth to function somewhat usably with wide and/or nonspacing characters, but at the expense of trashing the column alignment and visual layout of correct programs. It will also likely cause serious problems if used with GNU screen, which is width-aware.

One slightly problematic issue is what happens if you position the cursor 'in the middle' of a double-width character and then overwrite the second column of it. In general the results could be anything bogus, but good terminals will either erase the character or just leave half of it there. uuterm does not yet handle this case, and by chance it will end up looking for a double-width glyph for the newly written character (which might exist, depending on the font). This behavior of course should not be relied upon...

Rich

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
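
To make the column arithmetic above concrete, here is a rough Python sketch. It is only an illustration, not how any particular terminal or libc does it: char_width() is a hypothetical stand-in for wcwidth() built on Python's unicodedata tables, and it ignores control characters and any locale-specific width rules. It assumes \b moves the cursor back exactly one column.

```python
import unicodedata

def char_width(ch):
    """Approximate wcwidth(): 0 for nonspacing marks, 2 for wide, else 1.

    Assumption: this is a simplification based on Unicode properties,
    not the tables a real terminal or libc uses.
    """
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1

def backspaces_over(text, nchars):
    """How many '\\b' a program must send to step back over the
    last nchars characters of text when '\\b' moves by columns."""
    if nchars <= 0:
        return 0
    return sum(char_width(ch) for ch in text[-nchars:])

def cursor_column(data):
    """Cursor column after writing data on an empty line,
    treating '\\b' as 'move left one column' (never past column 0)."""
    col = 0
    for ch in data:
        if ch == "\b":
            col = max(0, col - 1)
        else:
            col += char_width(ch)
    return col

if __name__ == "__main__":
    # Three wide characters occupy columns 0-5; two '\b' land on column 4,
    # which is the first column of the third character.
    print(cursor_column("日本語\b\b"))
    # Stepping back over the last two wide characters takes four '\b'.
    print(backspaces_over("日本語", 2))
```

Under these assumptions, the echo test from the quoted message leaves the cursor at the start of the third character, so "hello" overwrites only that one; a terminal that moved by whole characters would instead overwrite the last two.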