2015-10-23 15:38 GMT+02:00 Ted Unangst <t...@tedunangst.com>:
> Christian Weisgerber wrote:
>> Ted Unangst:
>>
>> > --- ul.c 10 Oct 2015 16:15:03 -0000 1.19
>> > +++ ul.c 23 Oct 2015 10:29:43 -0000
>> > @@ -241,6 +241,8 @@ mfilter(FILE *f)
>> >                                  obuf[col].c_mode |= BOLD|mode;
>> >                          else
>> >                                  obuf[col].c_mode = mode;
>> > +                        if ((c & (0x80 | 0x40)) == 0x80 && col > 0)
>> > +                                obuf[col].c_mode = obuf[col - 1].c_mode;
>> >                          col++;
>> >                          if (col > maxcol)
>> >                                  maxcol = col;
>>
>> That doesn't quite work. Check out this:
>>
>> mandoc /usr/share/man/man1/ksh.1 | sed -n 1185,1190p | ul
>
> so that works with the diff below. i'm not sure how far down this road we need
> to travel, but i figure it's worth a little exploration.
>
> note that i don't think this handles the case of one character, backspace, a
> different character correctly, though it can asymptotically approach
> correct with some care.
>
> Index: ul.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/ul/ul.c,v
> retrieving revision 1.19
> diff -u -p -r1.19 ul.c
> --- ul.c 10 Oct 2015 16:15:03 -0000 1.19
> +++ ul.c 23 Oct 2015 13:31:45 -0000
> @@ -151,6 +151,12 @@ main(int argc, char *argv[])
>          exit(0);
>  }
>
> +int
> +isu8cont(unsigned char c)
> +{
> +        return (c & (0x80 | 0x40)) == 0x80;
> +}
> +
>  void
>  mfilter(FILE *f)
>  {
> @@ -158,8 +164,11 @@ mfilter(FILE *f)
>
>          while ((c = getc(f)) != EOF && col < MAXBUF) switch(c) {
>          case '\b':
> -                if (col > 0)
> +                while (col > 0) {
>                          col--;
> +                        if (!isu8cont(obuf[col].c_char))
> +                                break;

Should this check also be run in case of non-UTF-8 locale (read: "C" one)?

> +                }
>                  continue;
>          case '\t':
>                  col = (col+8) & ~07;
> @@ -241,6 +250,8 @@ mfilter(FILE *f)
>                                  obuf[col].c_mode |= BOLD|mode;
>                          else
>                                  obuf[col].c_mode = mode;
> +                        if ((c & (0x80 | 0x40)) == 0x80 && col > 0)
> +                                obuf[col].c_mode = obuf[col - 1].c_mode;
>                          col++;
>                          if (col > maxcol)
>                                  maxcol = col;
>

--
  WBR,
  Vadim Zhukov
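
For context on the locale question, here is a minimal sketch of how the backspace loop from the diff behaves on raw bytes. Only the isu8cont() test is taken from the patch; backup_one_char() is a hypothetical helper name used for illustration and works on a plain byte buffer rather than on obuf[]. On UTF-8 input the loop steps back over a whole multi-byte character. In a single-byte encoding such as ISO-8859-1, bytes 0x80-0xBF are ordinary characters, so the same loop would step back over those as well.

```c
#include <assert.h>
#include <stddef.h>

/* True for UTF-8 continuation bytes, i.e. bytes of the form 10xxxxxx. */
static int
isu8cont(unsigned char c)
{
	return (c & (0x80 | 0x40)) == 0x80;
}

/*
 * Hypothetical helper, not part of ul.c: given a byte buffer and the
 * current byte offset col, step back one byte at a time and stop at
 * the first byte that is not a UTF-8 continuation byte.  Returns the
 * new offset; an offset of 0 is returned unchanged.
 */
static size_t
backup_one_char(const unsigned char *buf, size_t col)
{
	while (col > 0) {
		col--;
		if (!isu8cont(buf[col]))
			break;
	}
	return col;
}
```

With the UTF-8 bytes 'a', 0xC3, 0xA9, 'z', backing up from offset 3 lands on offset 1, the lead byte of the two-byte character. With the ISO-8859-1 bytes 'a', 0xA9, backing up from offset 2 lands on offset 0, which skips two characters instead of one. That is the case the question is about.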