On Mon, Mar 02, 2020 at 06:25:47PM +0100, Ingo Schwarze wrote:
> Hi,
>
> Marc Chantreux wrote on Mon, Mar 02, 2020 at 11:49:31AM +0100:
>
> > coming from linux, i'm used to read manpages
> > in a vi buffer so i can do much more than
> > reading the content.
>
> I have no idea what the "much more" refers to. The main effect is to
> lose tagging functionality. That is, compared to man(1) with the
> default pager, you cannot use the :t functionality to move to the
> place where a word is defined.
>
> > i basically use
> >
> >   :r !man ls
> > or
> >   !!sh (when the line content is "man ls")
>
> Yikes. I had no idea what either of these are doing and had to
> try them out. vi(1) contains so much bloat that is never really
> needed and doesn't belong in a text editor at all.
>
> > under openbsd, it seems man doesn't if stdout
> > is a tty.
>
> You mean, man(1) doesn't *imply col -b* if stdout is *not* a tty?
>
> > i digged the man manual a little bit
> > without finding a solution so i worked the
> > things around:
> >
> >   :r !man ls|fmt
>
> As others said, the normal way to strip backspace formatting is
>
>   $ man ls | col -b
>
> It is documented in man(1) below the -c option and below EXAMPLES,
> and in mandoc(1) below "ASCII Output":
>
>   https://man.openbsd.org/man.1#c
>   https://man.openbsd.org/man.1#EXAMPLES
>   https://man.openbsd.org/mandoc.1#ASCII_Output
>
> You find such stuff as follows:
>
>   $ man -k 'Xr=col(1)'
>   man(1) - display manual pages
>   mandoc(1) - format manual pages
>
> The advantage of col(1) over fmt(1) is that it is guaranteed to not
> mess up line breaks.
>
> > now i would like a poor version of keyword
> > feature in openbsd vi. the linux version
> >
> >   map K yw:E /tmp/vi.keyword.$$p!!xargs man
>
> You don't say what that is supposed to do.
> > Under Debian Jessie, if i start "vim", then type > > :map K yw:E /tmp/vi.keyword.$$p!!xargs man <ENTER> > als <ESC> > K <ENTER> > > i get: > > Error detected while processing function > netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore..netrw#Explore: > line 30: > E132: Function call depth is higher than 'maxfuncdepth' > Press ENTER or type command to continue > > That doesn't seem useful to me. > > I also tried the same with OpenBSD vi(1) and it resulted in > > Usage: e[dit][!] [+cmd] [file]. > > So, no idea what you are trying to do. 
>
> > becomes
> >
> >   map K yw:E /tmp/vi.keyword.$$p!!xargs -IX sh -c 'man X|fmt'
> >
> > which doesn't work as | separates 2 vi commands.
> >
> > i really would like to know one or the two of these:
> >
> > * is there a way to ask man to deliver pure (non-formatted) text ?
>
> In 2014, i already wrote a patch to do that because the question
> came up repeatedly. But demand wasn't that high after all, so i
> never committed it. Now, i updated the patch to -current, see
> below.
>
> On the one hand, the UNIX philosophy is to have each tool do one
> thing well, then use pipes to connect tools as needed. Then again,
> arguably, you maybe shouldn't need another tool to just revert
> something that the first tool does. Why would *not* adding backspace
> formatting require a pipe to another program, rather than not adding
> it in the first place?
>
> Also, the patch that would be required is very small and straightforward.
>
> So, what do people think? Should i test the patch below in more
> depth and commit it? Or do people consider this bloat?
>
> Yours,
>   Ingo
>
>
> Index: main.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/main.c,v
> retrieving revision 1.247
> diff -u -p -r1.247 main.c
> --- main.c	24 Feb 2020 21:15:05 -0000	1.247
> +++ main.c	2 Mar 2020 17:06:53 -0000
> @@ -158,6 +158,7 @@ main(int argc, char *argv[])
>  	/* Search options. */
>
>  	memset(&conf, 0, sizeof(conf));
> +	conf.output.backspace = -1;
>  	conf_file = NULL;
>  	defpaths = auxpaths = NULL;
>
> @@ -373,6 +374,9 @@ main(int argc, char *argv[])
>  			return mandoc_msg_getrc();
>  		}
>  	}
> +
> +	if (conf.output.backspace == -1)
> +		conf.output.backspace = 1;
>
>  	/* Parse arguments. */
>
> Index: manconf.h
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/manconf.h,v
> retrieving revision 1.7
> diff -u -p -r1.7 manconf.h
> --- manconf.h	22 Nov 2018 11:30:15 -0000	1.7
> +++ manconf.h	2 Mar 2020 17:06:54 -0000
> @@ -1,6 +1,6 @@
>  /* $OpenBSD: manconf.h,v 1.7 2018/11/22 11:30:15 schwarze Exp $ */
>  /*
> - * Copyright (c) 2011, 2015, 2017, 2018 Ingo Schwarze <schwa...@openbsd.org>
> + * Copyright (c) 2011,2015,2017,2018,2020 Ingo Schwarze <schwa...@openbsd.org>
>   * Copyright (c) 2011 Kristaps Dzonsons <krist...@bsd.lv>
>   *
>   * Permission to use, copy, modify, and distribute this software for any
> @@ -33,6 +33,7 @@ struct manoutput {
>  	char	 *tag;
>  	size_t	  indent;
>  	size_t	  width;
> +	int	  backspace;
>  	int	  fragment;
>  	int	  mdoc;
>  	int	  noval;
> Index: mandoc.1
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/mandoc.1,v
> retrieving revision 1.166
> diff -u -p -r1.166 mandoc.1
> --- mandoc.1	15 Feb 2020 15:28:01 -0000	1.166
> +++ mandoc.1	2 Mar 2020 17:06:55 -0000
> @@ -284,6 +284,13 @@ The following
>  .Fl O
>  arguments are accepted:
>  .Bl -tag -width Ds
> +.It Cm format Ns = Ns Cm none
> +No back-spaced encoding is used, neither for bold face and underlining
> +nor for character overstrikes. Only the last character of each
> +overstrike group is printed.
> +This has the same effect as piping the output through
> +.Xr col 1
> +.Fl bx .
>  .It Cm indent Ns = Ns Ar indent
>  The left margin for normal text is set to
>  .Ar indent
> Index: manpath.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/manpath.c,v
> retrieving revision 1.28
> diff -u -p -r1.28 manpath.c
> --- manpath.c	10 Feb 2020 14:42:03 -0000	1.28
> +++ manpath.c	2 Mar 2020 17:06:57 -0000
> @@ -1,6 +1,6 @@
>  /* $OpenBSD: manpath.c,v 1.28 2020/02/10 14:42:03 schwarze Exp $ */
>  /*
> - * Copyright (c) 2011,2014,2015,2017-2019 Ingo Schwarze <schwa...@openbsd.org>
> + * Copyright (c) 2011,2014,2015,2017-2020 Ingo Schwarze <schwa...@openbsd.org>
>   * Copyright (c) 2011 Kristaps Dzonsons <krist...@bsd.lv>
>   *
>   * Permission to use, copy, modify, and distribute this software for any
> @@ -226,7 +226,7 @@ manconf_output(struct manoutput *conf, c
>  {
>  	const char *const toks[] = {
>  	    "includes", "man", "paper", "style", "indent", "width",
> -	    "tag", "fragment", "mdoc", "noval", "toc"
> +	    "format", "tag", "fragment", "mdoc", "noval", "toc"
>  	};
>  	const size_t ntoks = sizeof(toks) / sizeof(toks[0]);
>
> @@ -247,11 +247,11 @@ manconf_output(struct manoutput *conf, c
>  		}
>  	}
>
> -	if (tok < 6 && *cp == '\0') {
> +	if (tok < 7 && *cp == '\0') {
>  		mandoc_msg(MANDOCERR_BADVAL_MISS, 0, 0, "-O %s=?", toks[tok]);
>  		return -1;
>  	}
> -	if (tok > 6 && tok < ntoks && *cp != '\0') {
> +	if (tok > 7 && tok < ntoks && *cp != '\0') {
>  		mandoc_msg(MANDOCERR_BADVAL, 0, 0, "-O %s=%s", toks[tok], cp);
>  		return -1;
>  	}
> @@ -308,22 +308,43 @@ manconf_output(struct manoutput *conf, c
>  		    "-O width=%s is %s", cp, errstr);
>  		return -1;
>  	case 6:
> +		switch (conf->backspace) {
> +		case 0:
> +			oldval = mandoc_strdup("none");
> +			break;
> +		case 1:
> +			oldval = mandoc_strdup("backspace");
> +			break;
> +		default:
> +			if (strcmp(cp, "none") == 0) {
> +				conf->backspace = 0;
> +				return 0;
> +			} else if (strcmp(cp, "backspace") == 0) {
> +				conf->backspace = 1;
> +				return 0;
> +			}
> +			mandoc_msg(MANDOCERR_BADVAL_BAD, 0, 0,
> +			    "-O format=%s", cp);
> +			return -1;
> +		}
> +		break;
> +	case 7:
>  		if (conf->tag != NULL) {
>  			oldval = mandoc_strdup(conf->tag);
>  			break;
>  		}
>  		conf->tag = mandoc_strdup(cp);
>  		return 0;
> -	case 7:
> +	case 8:
>  		conf->fragment = 1;
>  		return 0;
> -	case 8:
> +	case 9:
>  		conf->mdoc = 1;
>  		return 0;
> -	case 9:
> +	case 10:
>  		conf->noval = 1;
>  		return 0;
> -	case 10:
> +	case 11:
>  		conf->toc = 1;
>  		return 0;
>  	default:
> Index: term.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/term.c,v
> retrieving revision 1.141
> diff -u -p -r1.141 term.c
> --- term.c	3 Jun 2019 20:23:39 -0000	1.141
> +++ term.c	2 Mar 2020 17:07:04 -0000
> @@ -1,7 +1,7 @@
>  /* $OpenBSD: term.c,v 1.141 2019/06/03 20:23:39 schwarze Exp $ */
>  /*
>   * Copyright (c) 2008, 2009, 2010, 2011 Kristaps Dzonsons <krist...@bsd.lv>
> - * Copyright (c) 2010-2019 Ingo Schwarze <schwa...@openbsd.org>
> + * Copyright (c) 2010-2020 Ingo Schwarze <schwa...@openbsd.org>
>   *
>   * Permission to use, copy, modify, and distribute this software for any
>   * purpose with or without fee is hereby granted, provided that the above
> @@ -795,24 +795,26 @@ encode1(struct termp *p, int c)
>  	f = (c == ASCII_HYPH || c > 127 || isgraph(c)) ?
>  	    p->fontq[p->fonti] : TERMFONT_NONE;
>
> -	if (p->flags & TERMP_BACKBEFORE) {
> -		if (p->tcol->buf[p->col - 1] == ' ' ||
> -		    p->tcol->buf[p->col - 1] == '\t')
> -			p->col--;
> -		else
> +	if (p->backspace) {
> +		if (p->flags & TERMP_BACKBEFORE) {
> +			if (p->tcol->buf[p->col - 1] == ' ' ||
> +			    p->tcol->buf[p->col - 1] == '\t')
> +				p->col--;
> +			else
> +				p->tcol->buf[p->col++] = '\b';
> +			p->flags &= ~TERMP_BACKBEFORE;
> +		}
> +		if (f == TERMFONT_UNDER || f == TERMFONT_BI) {
> +			p->tcol->buf[p->col++] = '_';
>  			p->tcol->buf[p->col++] = '\b';
> -		p->flags &= ~TERMP_BACKBEFORE;
> -	}
> -	if (f == TERMFONT_UNDER || f == TERMFONT_BI) {
> -		p->tcol->buf[p->col++] = '_';
> -		p->tcol->buf[p->col++] = '\b';
> -	}
> -	if (f == TERMFONT_BOLD || f == TERMFONT_BI) {
> -		if (c == ASCII_HYPH)
> -			p->tcol->buf[p->col++] = '-';
> -		else
> -			p->tcol->buf[p->col++] = c;
> -		p->tcol->buf[p->col++] = '\b';
> +		}
> +		if (f == TERMFONT_BOLD || f == TERMFONT_BI) {
> +			if (c == ASCII_HYPH)
> +				p->tcol->buf[p->col++] = '-';
> +			else
> +				p->tcol->buf[p->col++] = c;
> +			p->tcol->buf[p->col++] = '\b';
> +		}
>  	}
>  	if (p->tcol->lastcol <= p->col || (c != ' ' && c != ASCII_NBRSP))
>  		p->tcol->buf[p->col] = c;
> @@ -839,7 +841,9 @@ encode(struct termp *p, const char *word
>  	adjbuf(p->tcol, p->col + 2 + (sz * 5));
>
>  	for (i = 0; i < sz; i++) {
> -		if (ASCII_HYPH == word[i] ||
> +		if (p->backspace == 0 && word[i] == '\b')
> +			p->col--;
> +		else if (word[i] == ASCII_HYPH ||
>  		    isgraph((unsigned char)word[i]))
>  			encode1(p, word[i]);
>  		else {
> Index: term.h
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/term.h,v
> retrieving revision 1.75
> diff -u -p -r1.75 term.h
> --- term.h	4 Jan 2019 03:20:44 -0000	1.75
> +++ term.h	2 Mar 2020 17:07:04 -0000
> @@ -1,7 +1,7 @@
>  /* $OpenBSD: term.h,v 1.75 2019/01/04 03:20:44 schwarze Exp $ */
>  /*
>   * Copyright (c) 2008, 2009, 2010, 2011 Kristaps Dzonsons <krist...@bsd.lv>
> - * Copyright (c) 2011-2015, 2017, 2019 Ingo Schwarze <schwa...@openbsd.org>
> + * Copyright (c) 2011-2015,2017,2019,2020 Ingo Schwarze <schwa...@openbsd.org>
>   *
>   * Permission to use, copy, modify, and distribute this software for any
>   * purpose with or without fee is hereby granted, provided that the above
> @@ -73,6 +73,7 @@ struct termp {
>  	size_t	  viscol;	/* Chars on current line. */
>  	size_t	  trailspace;	/* See term_flushln(). */
>  	size_t	  minbl;	/* Minimum blanks before next field. */
> +	int	  backspace;	/* Use \b in output. */
>  	int	  synopsisonly;	/* Print the synopsis only. */
>  	int	  mdocstyle;	/* Imitate mdoc(7) output. */
>  	int	  ti;		/* Temporary indent for one line. */
> Index: term_ascii.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/term_ascii.c,v
> retrieving revision 1.50
> diff -u -p -r1.50 term_ascii.c
> --- term_ascii.c	19 Jul 2019 21:45:37 -0000	1.50
> +++ term_ascii.c	2 Mar 2020 17:07:04 -0000
> @@ -1,7 +1,7 @@
>  /* $OpenBSD: term_ascii.c,v 1.50 2019/07/19 21:45:37 schwarze Exp $ */
>  /*
>   * Copyright (c) 2010, 2011 Kristaps Dzonsons <krist...@bsd.lv>
> - * Copyright (c) 2014, 2015, 2017, 2018 Ingo Schwarze <schwa...@openbsd.org>
> + * Copyright (c) 2014, 2015, 2017-2020 Ingo Schwarze <schwa...@openbsd.org>
>   *
>   * Permission to use, copy, modify, and distribute this software for any
>   * purpose with or without fee is hereby granted, provided that the above
> @@ -112,6 +112,8 @@ ascii_init(enum termenc enc, const struc
>  		}
>  	}
>
> +	if (outopts->backspace)
> +		p->backspace = 1;
>  	if (outopts->mdoc) {
>  		p->mdocstyle = 1;
>  		p->defindent = 5;
>
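
For anyone unfamiliar with the backspace encoding that the quoted patch
makes optional: bold is written as "c\bc" and underline as "_\bc", and
stripping it means keeping only the last character of each overstrike
group. The following is only a minimal sketch of that idea in C. It is
not code from mandoc(1) or col(1); strip_overstrike() is a hypothetical
helper name, and it assumes well-formed UTF-8 input and a destination
buffer of at least strlen(src) + 1 bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of mandoc(1) or col(1): copy src to
 * dst, resolving backspace overstrikes so that only the last character
 * of each overstrike group survives.  Assumes well-formed UTF-8 and
 * that dst has room for strlen(src) + 1 bytes.  Returns the number of
 * bytes written, excluding the terminating NUL.
 */
size_t
strip_overstrike(const char *src, char *dst)
{
	size_t n = 0;

	assert(src != NULL && dst != NULL);
	for (; *src != '\0'; src++) {
		if (*src != '\b') {
			dst[n++] = *src;
			continue;
		}
		/* Nothing to erase at the start of the output or of a line. */
		if (n == 0 || dst[n - 1] == '\n')
			continue;
		/* Drop UTF-8 continuation bytes (10xxxxxx), then the lead byte. */
		while (n > 1 && ((unsigned char)dst[n - 1] & 0xc0) == 0x80)
			n--;
		n--;
	}
	dst[n] = '\0';
	return n;
}
```

Called on "N\bNA\bAM\bME\bE" (a bold "NAME") or on "_\bf_\bi_\bl_\be"
(an underlined "file"), this leaves plain "NAME" and "file" in dst.
Unlike col -b, the sketch does nothing about tabs, carriage returns or
reverse line feeds.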
Hi,

I wanted to do a similar thing (mandoc to UTF-8 text) and used col -b.

I noticed that when the ASCII/UTF-8 output of mandoc(1) is processed
with col(1), it also filters away UTF-8 non-breaking spaces (\xc2\xa0),
for example.

To reproduce more simply:

OpenBSD:

	printf 'test\xc2\xa0.\n' | col -b | hexdump -C
	00000000  74 65 73 74 2e 0a                                 |test..|

util-linux col uses wide chars and outputs:

	00000000  74 65 73 74 c2 a0 2e 0a                           |test....|

On NetBSD and in other col implementations there is a -p option.
The -p option is specified in an older standard:
Technical Standard Commands and Utilities Issue 4, Version 2, page 200
https://pubs.opengroup.org/onlinepubs/9695969399/toc.pdf

The patch below adds -p to col (from NetBSD):

diff --git usr.bin/col/col.1 usr.bin/col/col.1
index cceebfec5db..f0f1e906992 100644
--- usr.bin/col/col.1
+++ usr.bin/col/col.1
@@ -41,7 +41,7 @@
 .Nd filter reverse line feeds and backspaces from input
 .Sh SYNOPSIS
 .Nm col
-.Op Fl bfhx
+.Op Fl bfhpx
 .Op Fl l Ar num
 .Sh DESCRIPTION
 .Nm
@@ -73,6 +73,12 @@ Buffer at least
 .Ar num
 lines in memory.
 By default, 128 lines are buffered.
+.It Fl p
+Force unknown control sequences to be passed through unchanged.
+Normally,
+.Nm
+will filter out any control sequences from the input other than those
+recognized and interpreted by itself, which are listed below.
 .It Fl x
 Output multiple spaces instead of tabs.
 .El
diff --git usr.bin/col/col.c usr.bin/col/col.c
index c3c51b4c630..8b59a2f09cf 100644
--- usr.bin/col/col.c
+++ usr.bin/col/col.c
@@ -92,6 +92,7 @@ int fine;			/* if `fine' resolution (half lines) */
 int max_bufd_lines;		/* max # of half lines to keep in memory */
 int nblank_lines;		/* # blanks after last flushed line */
 int no_backspaces;		/* if not to output any backspaces */
+int pass_unknown_seqs;		/* whether to pass unknown control sequences */
 
 #define	PUTC(ch) \
 	if (putchar(ch) == EOF) \
@@ -118,7 +119,8 @@ main(int argc, char *argv[])
 
 	max_bufd_lines = 256;
 	compress_spaces = 1;		/* compress spaces into tabs */
-	while ((opt = getopt(argc, argv, "bfhl:x")) != -1)
+	pass_unknown_seqs = 0;		/* remove unknown escape sequences */
+	while ((opt = getopt(argc, argv, "bfhl:px")) != -1)
 		switch (opt) {
 		case 'b':		/* do not output backspaces */
 			no_backspaces = 1;
@@ -136,6 +138,9 @@ main(int argc, char *argv[])
 				errx(1, "bad -l argument, %s: %s", errstr,
 					optarg);
 			break;
+		case 'p':		/* pass unknown control sequences */
+			pass_unknown_seqs = 1;
+			break;
 		case 'x':		/* do not compress spaces into tabs */
 			compress_spaces = 0;
 			break;
@@ -212,7 +217,8 @@ main(int argc, char *argv[])
 				addto_lineno(&cur_line, -2);
 				continue;
 			}
-			continue;
+			if (!pass_unknown_seqs)
+				continue;
 		}
 
 		/* Must stuff ch in a line - are we at the right one? */
@@ -534,7 +540,7 @@ xreallocarray(void *p, size_t n, size_t size)
 void
 usage(void)
 {
-	(void)fprintf(stderr, "usage: col [-bfhx] [-l num]\n");
+	(void)fprintf(stderr, "usage: col [-bfhpx] [-l num]\n");
 	exit(1);
 }
 

-- 
Kind regards,
Hiltjo
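
P.S. To make the intent of the pass-through switch easier to see in
isolation, here is a rough sketch in C. It is deliberately simplified
and is not the col(1) source: filter_controls() and its pass_unknown
parameter are hypothetical names, and the ASCII-only range test is an
assumption made for the example, whereas the real program classifies
characters differently and also interprets backspace, carriage return
and escape sequences.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified filter (NOT the col(1) source): copy len
 * bytes from src to dst, which must hold at least len bytes.  ASCII
 * control bytes other than newline and tab count as "unknown" here
 * and are dropped unless pass_unknown is nonzero, loosely mirroring
 * the "if (!pass_unknown_seqs) continue;" change in the patch.
 * Bytes >= 0x80, such as the UTF-8 no-break space c2 a0, are never
 * treated as control bytes by this sketch.
 * Returns the number of bytes kept.
 */
size_t
filter_controls(const unsigned char *src, size_t len, unsigned char *dst,
    int pass_unknown)
{
	size_t i, n = 0;

	assert(src != NULL && dst != NULL);
	for (i = 0; i < len; i++) {
		unsigned char ch = src[i];
		int is_ctrl = ch < 0x20 || ch == 0x7f;

		if (is_ctrl && ch != '\n' && ch != '\t' && !pass_unknown)
			continue;	/* filtered, as without -p */
		dst[n++] = ch;
	}
	return n;
}
```

With pass_unknown set to 0, a stray 0x01 byte is removed; with 1 it is
copied through. The eight bytes of 'test\xc2\xa0.\n' survive in both
modes here only because the sketch never classifies bytes above 0x7f
as control characters. Whether a given col does so depends on its
implementation and locale handling.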