Markus Kuhn writes:
> In some CJK encoding, a single character can be either
>
> - a byte in the range 0x20-0x7f
> - a byte in the range 0x80-0xff followed by a byte in the range 0x20-0xff

That's Big5.

> Specify an efficient algorithm that
> determines how many bytes have to be removed from the end of that
> buffer if the user presses backspace to remove one character.

I had the same problem when making the "tail" program multibyte-aware
last week :-) "tail -m N" is supposed to print the last N characters of
the input file. And of course, it should only look at the tail of the
file, especially if the file is very large.
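
For illustration, here is a minimal sketch of one way to answer the quoted
question. It is not taken from the actual "tail" change. It assumes the
buffer starts on a character boundary and holds only complete characters of
the encoding described above; the function name last_char_len is
hypothetical. The idea: any byte below 0x80 ends a character, so count the
bytes >= 0x80 directly in front of the last byte. If that count is odd, one
lead byte is left over and the last byte is its trail byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes occupied by the last character
 * of buf[0..len-1], i.e. how many bytes a backspace should remove.
 * Assumption: the buffer starts on a character boundary and contains
 * only complete characters (one byte < 0x80, or a byte >= 0x80
 * followed by one trail byte). */
size_t last_char_len(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len == 0)
        return 0;               /* nothing to delete */

    /* Count the bytes >= 0x80 that immediately precede the last byte. */
    size_t run = 0;
    while (run + 1 < len && buf[len - 2 - run] >= 0x80)
        run++;

    /* The byte in front of the run (or the buffer start) ends a
     * character, so the run begins on a boundary and its bytes pair up
     * as lead/trail from there. An odd run leaves one unpaired lead
     * byte, which owns the last byte as its trail byte. */
    return (run % 2 == 1) ? 2 : 1;
}
```

Note that the backward scan only stops at a byte below 0x80, so on a long
stretch of bytes >= 0x80 it can reach all the way back to the start of the
buffer. Calling the function repeatedly while shrinking len steps back one
character at a time, which is the basic operation needed to find where the
last N characters begin.
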
Bruno
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/