On Mon, 20 May 2002, Zvi Har'El wrote:

> I am using less on a UTF-8 Redhat Linux 7.3 machine. I am having trouble
> using man, because overstriking is not handled properly. I read the
> Unicode HOWTO and compiled less (358) with the patch suggested by
> http://mail.nl.linux.org/linux-utf8/2001-05/msg00023.html
> and the situation improved. However it is not completely OK, as you may easily
....

  I'm afraid the patch you applied introduced the problem you described
while solving the problem of overstriking in UTF-8 mode. BTW, the
patch (as applied by the author of less in less 361) only works for
two-octet-long UTF-8 characters.


> at the beta version of less (377?), but it didn't address this bug at all

 The patch you referred to seems to have been applied in less 361,
according to the version.c file.

  Anyway, attached is a *simplistic* (not perfect) patch against
less 374 (the newest on the less home page) that I've just made. It
apparently solves both issues: overstriking of three-octet-long UTF-8
characters, and underlining and overstriking of two identical US-ASCII
characters in a row ('ff' in 'troff', 'tt' in 'pattern'). It's not
perfect because it only checks the first octet of a two- or
three-octet-long UTF-8 char. to see whether it's identical to the
first octet of the char. preceding the backspace.
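
  To illustrate what a complete check could look like (as opposed to the
first-octet check in the patch), here is a minimal sketch. The names
utf8_seq_len() and same_as_previous_char() are hypothetical and are not
part of less; the sketch assumes the line buffer holds well-formed UTF-8
and that all octets of the character following the backspace are already
available.

```c
#include <stddef.h>
#include <string.h>

/* Length in octets of the UTF-8 sequence introduced by lead octet b,
 * or 0 if b is a continuation octet or not a valid lead octet.
 * (Hypothetical helper, not taken from less.) */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80)
        return 1;               /* 0xxxxxxx: US-ASCII */
    if ((b & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((b & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((b & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 0;                   /* 10xxxxxx or invalid */
}

/* Return 1 if the last character stored in buf[0..curr-1] is exactly
 * the len-octet character ch, comparing every octet rather than only
 * the first one. Return 0 otherwise. */
static int same_as_previous_char(const char *buf, size_t curr,
                                 const char *ch, size_t len)
{
    if (len == 0 || curr < len)
        return 0;
    /* The octet len positions back must start a sequence of that length. */
    if (utf8_seq_len((unsigned char)buf[curr - len]) != len)
        return 0;
    return memcmp(buf + curr - len, ch, len) == 0;
}
```

  With a check of this kind, U+AC00 followed by backspace and U+AC01
(case 3 in the attached test file) would not be treated as overstriking,
because the third octets differ.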

  I tested it under a UTF-8 xterm and it worked fine with the attached
test case, in which 'nroff', U+0411, U+2010, U+AC00 and U+4E00 are
overstruck and 'pattern' is underlined. Underlining doesn't work for
UTF-8 characters other than US-ASCII, though. However, that is also the
case with less 374 without my patch.
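
  For anyone who wants to produce similar test input, the following is a
rough sketch of how the nroff-style sequences can be generated: bold is
"char, backspace, char" and underline is "underscore, backspace, char".
The function names are hypothetical, the input is assumed to be
well-formed UTF-8, and the output buffer is assumed to hold at least
3 * strlen(in) + 1 octets.

```c
#include <stddef.h>
#include <string.h>

/* Number of octets in the UTF-8 character starting at s: the lead octet
 * plus any following continuation octets (10xxxxxx).
 * Assumes well-formed, NUL-terminated input. */
static size_t char_octets(const char *s)
{
    size_t n = 1;
    while (((unsigned char)s[n] & 0xC0) == 0x80)
        n++;
    return n;
}

/* Write each character of in to out in nroff style:
 *   underline == 0:  X BS X   (overstruck, i.e. bold)
 *   underline != 0:  _ BS X   (underlined)
 * Returns the number of octets written, not counting the final NUL. */
static size_t nroff_style(const char *in, char *out, int underline)
{
    size_t o = 0;
    while (*in != '\0') {
        size_t n = char_octets(in);
        if (underline) {
            out[o++] = '_';
        } else {
            memcpy(out + o, in, n);
            o += n;
        }
        out[o++] = '\b';
        memcpy(out + o, in, n);
        o += n;
        in += n;
    }
    out[o] = '\0';
    return o;
}
```

  Writing the resulting buffer to a file and viewing it with less in a
UTF-8 terminal should exercise the same code path as the attached test
case.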

   Hope this helps,

   Jungshik Shin
--- line.c.orig Mon May 20 11:56:34 2002
+++ line.c      Mon May 20 12:53:36 2002
@@ -592,12 +592,19 @@
                 * or just deletion of the character in the buffer.
                 */
                overstrike--;
-               if (utf_mode && curr > 1 && (char)c == linebuf[curr-2])
+               if (utf_mode && c & 0x80 && curr > 2 && (char)c == linebuf[curr-3])
                {
                        backc();
                        backc();
+                       backc();
+                       overstrike = 3;
+               } else if (utf_mode && c & 0x80 && curr > 1 && (char)c == linebuf[curr-2])
+               {
+                       backc();
+                       backc();
+                       STORE_CHAR(linebuf[curr], AT_BOLD, pos);
                        overstrike = 2;
-               } else if (utf_mode && curr > 0 && (char)c == linebuf[curr-1])
+               } else if (utf_mode && curr > 0 && c & 0x80 && (char)c == linebuf[curr-1])
                {
                        backc();
                        STORE_CHAR(linebuf[curr], AT_BOLD, pos);
1. nroff
nnrrooffff
nnrrooffffgg ABCD


2. UTF-8 chars: two or three octets long
ББ
‐‐
가가가abbc
一一一가

3. This does not work!! The first octet of the char. following the
backspace is the same as the first octet of the char. preceding the
backspace, but a subsequent octet is different, so the
backspace should erase the char. before it.

가각가abbc
Бӡ

4. pattern : underlined

_p_a_t_t_e_r_n


5. Underlining does not work for UTF-8 chars.
_‐
_Б
_A_B


6. This is the reverse of the common convention (as used by nroff),
but it works.

‐_
Б_
가_
