Re: man(7) <-> mdoc(7) (approximate) correspondence table?

2024-05-12 Thread Ingo Schwarze
Hi,

Alejandro Colomar wrote on Sat, Apr 27, 2024 at 10:41:44AM +0200:
> On Sat, Apr 27, 2024 at 04:17:28PM +1000, Alexis wrote:

>> As someone who's much more familiar with mdoc(7) than man(7), is there an
>> approximate 'correspondence table' somewhere that gives at least a rough
>> sense of which man(7) macros to use when, in an mdoc(7) context, one would
>> use a given mdoc(7) macro? Such a table might look something like (to use
>> some obvious probable correspondences):
>> 
>> | mdoc(7) | man(7) | Notes
>> +---------+--------+------
>> | Lk  | UR |
>> | Op  | OP |
>> | Sh  | SH |
>> | Ss  | SS |

If you are familiar with the C programming language, you might be able
to use

  https://cvsweb.bsd.lv/~checkout~/mandoc/mdoc_man.c?rev=HEAD

which is a fully automatic mdoc-to-man translator and only 39 kB of code.

Caveat: some tasks are harder to do fully automatically than with the
human mind.  Consequently, that translator for example does not use
the man(7) font macros (like .B and .BR) but uses font escapes instead,
like \fB and \fR.

However, it does produce these, where appropriate:

  HP PD PP RE RS SH SS TE TH TP TS

And the code is ordered according to the mdoc(7) macros,
so you can look up an mdoc(7) macro in the mdoc_man_acts[]
table at the top, then look at what its one, two, or three
handler functions do.  If all three handler functions are NULL,
no man(7) macro is needed, just put the plain text on a text line
in the man(7) file.
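
To illustrate the lookup described above, here is a minimal sketch of
such a table in C.  It is not copied from mdoc_man.c: the struct
layout, the handler names, the two table entries, and the function
needs_man_markup() are all assumptions made up for this example, and
"Xx" is not a real mdoc(7) macro.

```c
#include <stddef.h>
#include <string.h>

struct node;	/* opaque syntax tree node (hypothetical) */

/* One table entry: up to three optional handlers per mdoc(7) macro. */
struct act {
	const char *mdoc_name;
	int (*cond)(const struct node *);	/* should we act at all? */
	int (*pre)(const struct node *);	/* runs before the content */
	void (*post)(const struct node *);	/* runs after the content */
};

/* Illustrative handlers only; a real translator would emit man(7) code. */
static int pre_section(const struct node *n) { (void)n; return 1; }
static void post_section(const struct node *n) { (void)n; }

static const struct act acts[] = {
	{ "Sh", NULL, pre_section, post_section },	/* illustrative entry */
	{ "Xx", NULL, NULL, NULL },	/* made-up macro without handlers */
};

/*
 * Return 1 if the macro has at least one handler, 0 if all three
 * handlers are NULL (plain text suffices), -1 if it is not in the table.
 */
int
needs_man_markup(const char *mdoc_name)
{
	size_t i;

	for (i = 0; i < sizeof(acts) / sizeof(acts[0]); i++) {
		if (strcmp(acts[i].mdoc_name, mdoc_name) != 0)
			continue;
		return acts[i].cond != NULL || acts[i].pre != NULL ||
		    acts[i].post != NULL;
	}
	return -1;
}
```

When reading the real file, the same idea applies: find the row for the
macro, then read whichever handlers are not NULL.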


> I have similar problems when writing mdoc(7).  What I tend to do is look
> at good (e.g., OpenBSD) mdoc(7) pages' output, and then look at their
> source to see what they use.

Not a bad idea.

In addition, the following alphabetical index may be useful for people
who try to write or maintain mdoc(7) documents:

  https://mandoc.bsd.lv/mdoc/appendix/markup.html

Once you have identified a candidate macro in that list, look at

  https://man.openbsd.org/mdoc.7

to learn how to use it.

> I can only recommend you look at pages in the Linux man-pages project,
> and follow what you see (you can ask me if a page is a good reference).
> I try to have them all with perfect source, but there are too many of
> them.

That sounds quite reasonable, too.

Yours,
  Ingo



Re: Why does groff require psutils?

2023-11-26 Thread Ingo Schwarze
Hi,

not related to the "psutils" questions, but this almost made my
eyes fall out.

Alexis wrote on Sun, Nov 26, 2023 at 12:28:25PM +0100:

> Would replacing the following in src/preproc/html/pre-html.cpp
>   s = make_string("psselect -q -p%d %s %s\n",
>                   pageno, psFileName, psPageName);

WHOA.

What kind of crappy code is that?

It's really "C Programming 101" that you must *never* do anything
like that.  Obviously, execve(2) or a similar library function
that does not suffer from shell argument splitting and shell
metacharacter issues must be used here.  If we want to continue
shipping preproc/html, i think this definitely needs to be fixed.

I mean, for all i know, there are people running "groff -T html"
on public web servers to serve manual pages to the general public
via public CGI interfaces...
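
For illustration, a minimal sketch of the shell-free alternative,
assuming a POSIX system.  The function names run_argv() and
select_page() are made up for this example and are not groff code;
only the psselect arguments are taken from the quoted call.  Each
argument is passed as its own argv[] element, so file names containing
spaces or shell metacharacters are never interpreted by a shell.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] with the given argument vector, bypassing the shell.
 * Returns the exit status of the child, 127 if the program could
 * not be executed, or -1 on other failures.
 */
int
run_argv(char *const argv[])
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);	/* only reached if execvp(3) failed */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical replacement for the quoted make_string() call. */
int
select_page(int pageno, const char *ps_file, const char *ps_page)
{
	char pagearg[32];
	char *argv[6];

	snprintf(pagearg, sizeof(pagearg), "-p%d", pageno);
	argv[0] = "psselect";
	argv[1] = "-q";
	argv[2] = pagearg;
	argv[3] = (char *)ps_file;	/* passed verbatim, never parsed */
	argv[4] = (char *)ps_page;
	argv[5] = NULL;
	return run_argv(argv);
}
```

A real fix would also have to deal with error reporting and with
waitpid(2) being interrupted by signals, which this sketch ignores.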

Yours,
  Ingo



Re: mandoc -man -Thtml bug: inconsistent vertical space before .TP

2023-11-10 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Sat, Oct 28, 2023 at 02:34:45PM -0500:
> At 2023-10-26T18:37:58+0200, Ingo Schwarze wrote:

>> In particular, when designing a markup language for documentation, i
>> consider it critical to carefully compare the design to HTML, LaTeX,
>> and mdoc(7) before making final decisions, and there may be a few
>> more that might also be worth looking at for comparison.

> I was not attempting to confess ignorance of HTML, but making the
> much simpler statement that I had not previously bothered to attempt
> a mapping from man(7) macro names to HTML elements.

But *that* is exactly the kind of comparison needed when you design
new macros for the man(7) language: in the context of "i have this
novel idea what man(7) should maybe become able to do, and i have a
rough idea for my new macro or for extending an existing macro," the
"compare the design" i mentioned above means to ask: Can HTML, LaTeX,
mdoc(7) do the same thing, and if so, how?  If the answer is "no",
that does not necessarily mean it's a bad idea, but if the answer is
"i never thought about that", that surely sounds disquieting.

FWIW, Kristaps and i have not only "attempted" that, but
completed that very task to a state that i would call ready for
production (but certainly not perfect):

  https://cvsweb.bsd.lv/mandoc/man_html.c

That work started in 2009 and was already quite usable in 2010.

The result isn't great, mind you.  It writes lots of  and ,
almost none of which are correct according to HTML 5.  It also
writes a few  and  and  that are rather unexpressive
and only mitigated by class= attributes for CSS purposes.

So the idea is absolutely not new and certainly one among several
aspects to consider when designing or extending man(7) macros.

> [2] That may sound like a surprising statement, or at least one begging
> follow-up.  The idea is this: grohtml attempted to solve the HTML
> translation problem by working _completely generally_ with any valid
> *roff input, which is hopelessly loose and not block-structured.

At least part of what you say in this footnote makes sense to me,
though i don't feel ready to judge whether a project based on these
ideas could succeed.

But one aspect is certain: the main reason why mandoc(1) HTML output
is so much better than grohtml(1) is exactly what you described: it
restricts itself to two macro sets and preserves macro information
from the input right through to the formatters.

I think the AST approach taken by mandoc(1) is more structured,
simpler, and likely more powerful than your "x" extension command
idea, but that doesn't mean the "x" idea is doomed to failure.

> Worry not.  I'm reading.  The 1,500-page spec of the "Living Document"
> is a bit discouraging in its length, however.

Yes, HTML 5 has grown fat, as most languages do when they are no
longer young.  Still, i find the HTML standard easier to read than
most.  If you want something truly horrific, try reading the XSLT or
ASN.1 standards.  PDF and CSS are also way worse than HTML.

> Alex Colomar thinks groff_man_style(7) is dauntingly long at 20.

And a certain G. Branden Robinson thinks that mdoc(7) is dauntingly
long at 33.

Don't forget the target audience, though.  We expect the average
programmer not interested in markup to use groff_man_style(7)
and mdoc(7) for their daily side task of documenting their own
program.

I only expect the *designer* of man(7) to consider how their
design compares to HTML, LaTeX etc.  Users of man(7) certainly
need not torment themselves with reading the HTML standard.

>>> LS -> 
>>> TP ->  ... 
>>> LE -> 
> [...]
>> Well, *if* you really want to totally redesign the very foundations
>> of man(7) and change it from almost presentation-only and almost
>> in-line-macro only to the totally different paradigms of semantic
>> markup and block oriented, that is definitely one among the many tasks
>> involved in redesigning.

> At this point I don't think "totally redesigning" man(7) is necessary,
> either in general or to achieve the specific aim above.  man(7) already
> has no list-structuring macros, so I don't have to delete anything.
> Just add a bit of information that, for anything other than HTML output,
> can be harmlessly ignored anyway.

With "totally redesigning" i mean turning the paradigm upside down.
Even with your latest additions, man(7) is still an almost
exclusively presentational language with an almost exclusively
in-line no-block structure.

What you are aiming for is apparently a mostly semantic language with
mostly block-nesting structure.  How is that not "totally redesigning"?

Yes, there are a few exceptions already, but they feel unsystematic:
These are semantic explicit

.Li in mdoc(7), was: `\c`, mdoc(7), and man(7) extension macros

2023-11-07 Thread Ingo Schwarze
Hi James and Alexis,

Alexis wrote on Mon, Nov 06, 2023 at 11:28:55AM +1100:
> "James K. Lowden"  writes:

>> .Sh SYNOPSIS
>> .Nm 

Since you are asking about style, here is a tiny detail:
In the SYNOPSIS, i recommend not leaving out the .Nm argument,
mostly for the benefit of human readers of the source code,
even though there is no syntactic, semantic, or portability
problem with leaving it out.

See, for example,

  https://mandoc.bsd.lv/mdoc/intro/synopsis_util.html#EXAMPLES

>> .Op Fl D Ns Ar name Ns Oo Li = Ns Ar value Oc
>>
>> The = is not an argument.

That is true.  "Ar name Ns Oo Ar = Ns Ar value Oc"
would be wrong for semantic reasons.

>> It's a literal;

There is mnemonic confusion here.  The term "literal" sounds as if
you intended it to convey semantics, but .Li is a purely presentational
macro and has no semantic value whatsoever.  The macro .Li
means "request a typewriter font":

  https://man.openbsd.org/mdoc.7#Li

When you want a semantic macro that expresses the idea of "literal"
or "this string has to be provided verbatim by the user", use a macro
like .Cm, .Fl, .Ic, or .Fn, or if none of semantic macros fits and
you still want to express "literal, verbatim", use .Sy:

  https://man.openbsd.org/mdoc.7#Sy

In contrast to .Li, .Sy *does* provide some (limited) semantic value.

So being (excessively) strict, one could write

  .Op Fl D Ar name Ns Op Sy = Ns Ar value

because the user has to type the equal sign verbatim, just like the -D.

However, i do not recommend .Sy here.  The usual way is to regard
the equal sign as punctuation and not mark it up at all.

>> that is, it stands for itself.  It separates two arguments.  
>>
>> What is the nondeprecated preferred alternative? 

> i'm certainly interested in Ingo's answer here, but in that 
> specific case, i'd simply leave 'Li' out, as it's not required.

I fully agree.  The standard way to write this is:

  .Op Fl D Ar name Ns Op = Ns Ar value

> (i presume because the 'Oo' has effectively "opened a new formatting 
> context" in which text is literal until a new macro is invoked, 
> but i'd be happy to be corrected.)

Even though i never heard the term "formatting context", i think
you intend to say the right thing here.  Using proper terminology,
i would express your idea as follows:

  .Ar is an in-line macro.  Consequently, its scope only extends
  to the next macro on the same input line or to the end of the
  input line, whichever comes earlier.  That means the font
  inside the inner .Op macro is the same as the font at the
  beginning of the input line, before the .Fl, i.e. the default
  roman font.

Yours,
  Ingo



Re: `\c`, mdoc(7), and man(7) extension macros (was: [PATCH 1/2] man*/: srcfix)

2023-10-26 Thread Ingo Schwarze
Hi Branden and Alejandro,

G. Branden Robinson wrote on Thu, Oct 26, 2023 at 07:58:35AM -0500:
> At 2023-10-25T21:38:59+0200, Alejandro Colomar wrote:
>> On Wed, Oct 25, 2023 at 01:54:24PM -0500, G. Branden Robinson wrote:

>>> diff --git a/man2/open.2 b/man2/open.2
>>> index 4c921723c..6603dfdff 100644
>>> --- a/man2/open.2
>>> +++ b/man2/open.2
>>> @@ -82,8 +82,13 @@ .SH DESCRIPTION
>>>  to an entry in the process's table of open file descriptors.
>>>  The file descriptor is used
>>>  in subsequent system calls
>>> -.RB ( read "(2), " write "(2), " lseek "(2), " fcntl (2),
>>> -etc.) to refer to the open file.
>>> +(\c

>> I'm going to disagree with Ingo with his claim that a macro that
>> forces using \c is bad because it promotes bad style.  '(\c' doesn't
>> look bad to me here.  Not more than having the leading punctuation as
>> an Nth argument.

> I disagree with Ingo on that point as well.
> 
> Saying why leads me to digress; I found myself writing down thoughts
> about future man(7) development more concretely than I have to date.
> (I'll return to this patch at the end.)  So, Ccing the groff list...
> 
> I think Ingo's perspective is strongly influenced by mdoc(7), the use of
> which he strenuously advocates.  And mdoc _does_ manage to make `\c`
> almost(?) totally unnecessary--

I think the sentence is accurate even without the "almost".
I don't recall ever using \c in any mdoc(7) page, and not even
seeing it used by others, and i cannot think of any reason why
anyone should ever want to use it.

> at the cost of a weighty internal recursive macro reprocessing system
> that no other *roff package is known to implement.
> 
> (This is what that "parsed"/"callable" stuff in groff_mdoc(7) (and
> mandoc_mdoc(7)) is all about.  Also, by "weighty", I mean it--back in
> ~1990, when mdoc was implemented, its documentation warned the reader of
> its slowness.  Fortunately, on modern systems, the rendering latency
> relative to man(7) is no longer noticeable.)

This all seems accurate.  With groff or any other full roff(7)
implementation, parsing mdoc(7) is still significantly slower
than parsing man(7) (when regarding the same amount of text output).
Consequently, while mandoc(1) is significantly faster than groff for
both mdoc(7) and man(7), the speed benefit is *much* more pronounced
for mdoc(7) than for man(7).

Your argument that nowadays, sufficient computing power is usually
available such that these performance differences typically no longer
matter for interactive use also makes sense.

> Even with performance considerations out of the picture, I think such a
> system is a point against adoption of mdoc; one can observe that,
> nowadays, both man(7) and mdoc require a person to acquire knowledge
> that they will "never" transfer anywhere else, assuming no resurgence in
> *roff popularity.

Granted.  Then again, trying to find such aspects, i actually found
fewer than expected.  Here are some examples of aspects common to
both languages that aren't actually that unusual:

 * The '.' request/macro line marker is not that different from
   markers used for embedding in-band control information into
   various protocols and encoding schemes.
 * The same applies to the `\` character introducing escape sequences.
 * While \" is a very unusual way of introducing comments,
   it is used in the same way as comment markers in most other
   line-oriented languages.
 * Unescaped whitespace (SPACE and TAB) has syntactical meaning
   and needs escaping in many languages.
 * Protected whitespace ("\ ", \~) exists in many languages.
 * While some of the names of scaling units may be unusual,
   the practice of appending units to numbers is fundamental
   in all sciences.  Besides, scaling units follow traditions
   of classical typography and arguably aren't obsolete even in
   the digital age.
 * Quoting arguments containing whitespace in "" is widespread
   in many languages.
 * Backslash-newline for input line continuation is widespread
   among line-oriented languages, too.
 * Having multiple glyphs for various kinds of hyphens and dashes
   is ubiquitous in typography.
 * Having to pay attention to quotes and accents and their proper
   encoding in input files is ubiquitous in typography, too.

Roff idiosyncrasies that are common to mdoc(7) and man(7) and
unlikely to help anywhere else include:

 * Arguably the most unusual feature is \& and its various use cases.
 * Escaping '\' requires \[rs] (GNU syntax) or \e (portable syntax),
   and \\ has a totally different and rather complicated meaning.
 * The same goes for having to use "\&." rather than "\.".
 * That totally blank lines are not ignored in most contexts
   but more or less equivalent to .sp is certainly surprising.
 * Escaping " inside macro arguments by doubling it is quite unusual.
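
As an illustration of two items from this list, here is a minimal
sketch of a C function that turns an arbitrary string into one quoted
macro argument by doubling embedded double quotes and writing each
backslash as \e.  The name roff_quote_arg() is made up for this
example, and the sketch deliberately ignores other characters that may
need care, so treat it as an assumption-laden demonstration rather
than a complete escaping routine.

```c
#include <stddef.h>

/*
 * Write src to dst as a single quoted roff macro argument.
 * Returns the number of bytes written (excluding the NUL),
 * or 0 if dst is too small.
 */
size_t
roff_quote_arg(const char *src, char *dst, size_t dstsz)
{
	size_t n = 0;

	if (dstsz < 3)
		return 0;
	dst[n++] = '"';
	for (; *src != '\0'; src++) {
		/* Reserve two bytes now plus closing quote and NUL. */
		if (n + 4 > dstsz)
			return 0;
		if (*src == '"') {
			dst[n++] = '"';		/* double the quote */
			dst[n++] = '"';
		} else if (*src == '\\') {
			dst[n++] = '\\';	/* portable \e for backslash */
			dst[n++] = 'e';
		} else
			dst[n++] = *src;
	}
	dst[n++] = '"';
	dst[n] = '\0';
	return n;
}
```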

But the most scary idiosyncrasies of roff(7) syntax and semantics
appear only in low-level roff(7) and do not haunt manual page
authors, for example:

 * 

Re: Why does man(7) have 3 paragraph macros for the same thing?

2023-10-26 Thread Ingo Schwarze
Hi Branden and Alejandro,

G. Branden Robinson wrote on Thu, Oct 26, 2023 at 10:28:13AM -0500:
> At 2023-10-26T16:58:13+0200, Alejandro Colomar wrote:
>> On Thu, Oct 26, 2023 at 09:51:40AM -0500, G. Branden Robinson wrote:
>>> At 2023-10-26T16:12:36+0200, Alejandro Colomar wrote:

>>>> Regarding PP, LP, and P, what's the history of them?  Why do we
>>>> have the 3?  I'm willing to reduce them to just one.

>>> Doug's original man(7) (1979) didn't have `P`.  But Unix System III
>>> added it in 1980, and 4.3BSD followed suit in 1986.  This
>>> information is in groff_man(7).

>> Was the original PP?

> It had both `PP` and `LP`.

>> Still, compatibility with ms(7) would make it slightly easier to
>> transfer learning from man(7) to ms(7), would one learn it.  I know
>> many other macros are incompatible in bad ways, but the less the
>> better, no?

> That's true, but these days the knowledge transfer is, I submit, vastly
> more likely to go the other way; that is, people will be exposed to
> man(7) as their first roff macro language, and might decide to pick up
> ms(7).
> 
> At that point, they'd have to learn that `LP` and `PP` do _different_
> things.  I think it's actually better if they _don't_ have to unlearn
> the "fact" (applicable only to man(7)) that they are exactly the same.
> 
> Better, I believe, to promote only `P` in man(7).  Anyone wanting to
> pick up mm(7) will still enjoy some knowledge transfer.  Without
> arguments, `P` in mm(7) "does what you mean".

I consider this a bikeshed discussion.

Given that
 * Branden apparently wants to promote .P and deprecate .PP, and
 * i don't want mandoc_man(7) to gratuitously spread any more bad
   man(7) style advice than is unavoidable by the fundamental decision
   of declaring the whole man(7) language as obsolete,
i briefly considered changing mandoc_man(7).

Currently it says:

  PP  Begin an undecorated paragraph.  The scope of a paragraph is closed
  by a subsequent paragraph, sub-section, section, or end of file.
  The saved paragraph left-margin width is reset to the default.

  LP  A synonym for PP.

  P   This synonym for PP is an AT&T System III UNIX extension later
  adopted by 4.3BSD.

and it declares LP and P deprecated by including only PP in the
MACRO OVERVIEW.

All the arguments feel weak in either direction:

 * In theory, .PP is more portable than .P, but that is extremely
   unlikely to ever matter in practice.
 * As seen above, the similarities and subtle differences
   when comparing to ms(7) can be employed as arguments in either
   direction.
 * The arguably more important similarity that HTML defines a <p>
   but not a <pp> element can be regarded as a learning aid,
   but it's still a weak argument because HTML and roff(7) are
   very different domains and not similar in most other respects.
 * The similarity of .P and <p> can also be turned around to be
   levied as an argument for .PP:  .P and <p> are *very different*
   in so far as <p> is a block element, whereas .P is an in-line
   macro that cannot participate in block nesting.  In particular,
   it can neither nest inside a list item, nor can anything be
   contained inside a .P syntax tree node.  In contrast to <p>,
   .P does not represent a *paragraph*, but only a paragraph *break*.
 * .PP is more similar to mdoc(7) .Pp.  Again, a weak argument because
   macro naming is totally different in both languages even in most
   of the few cases where functionality matches, with the exception
   of only .SH and .SS.

Consequently, i tend to leave mandoc_man(7) just as it is and not
repaint the bikeshed.  That way, the original .PP macro - with which
nothing is really wrong, except for the fundamental design mistake
of not being a block macro, a mistake it shares with mdoc(7) .Pp -
gets the full description, while the slightly younger .P gets the
compat info, even though that now is only of historical but not
of practical interest.  Maybe still nice to keep both apart - gee,
yet another weak argument.

If, for some reason, you feel strongly about it and think it is
important which one to promote, it might be possible to convince me to
deprecate .PP and list .P as the non-deprecated form even though it
is theoretically less portable.  I must admit i don't particularly
like the idea, though.  It feels like taking a gratuitous risk,
which does not feel ideal even if both the magnitude of the risk
and the benefit reaped are almost exactly zero.

Yours,
  Ingo



Re: mandoc -man -Thtml bug: inconsistent vertical space before .TP

2023-10-26 Thread Ingo Schwarze
Hi Branden and Alejandro,

G. Branden Robinson wrote on Tue, Oct 24, 2023 at 04:54:21AM -0500:
> At 2023-10-24T02:13:34+0200, Ingo Schwarze wrote:

>>  5. On top of all that, i have a hard time to think of any macro
>> that has a more wicked failure mode than .TQ in case the
>> formatter does not support it.  The output visually looks
>> perfectly fine, and the reader gets no hint that the *most
>> important* information is missing.

> I think you're getting carried away here.  `TQ` _takes no arguments_,
> so if a formatter completely ignores it, no text goes missing.

That is completely true, and my point 5 is completely bogus.  This is
embarrassing, in particular since it also tainted the commit message.
I'm not completely sure how the wrong idea entered my head, possibly by
confusing .TP and .IP, which i tend to confuse regularly, even though
the mnemonics isn't actually bad (.TP is the one that can *only*
be reasonably used for lists with a tag, i.e. the one with mandatory
next-line syntax, whereas .IP is the one that can be used for any
indented paragraph, even without a tag).  Not an excuse of course,
i should have looked more closely and not jumped to wrong conclusions.

> Description
>abra   cadabra Newport News.

Yes, that's good enough as a fallback for a few rarely used formatters.

> I think you may have mixed in your own fulminations against the
> new `MR` macro (whose presence in groff is my doing) with your
> apprehension about `TQ`, which predates my participation in groff
> development by several years.

That's another possible factor which might have contributed to my
massive gaffe, yes.

> I'm a bit dubious of `TQ` myself, but it does fulfill, partially,
> a need that mdoc(7) meets with `.Bl -compact`.

Yes, i fully agree with that.
My points 1. to 4. remain valid, but that only means .TQ should not
be used excessively.  The fact that my point 5 was bogus means
that avoiding it almost completely is not required.
Indeed, reasonable use cases seem the same as for .Bl -compact
with selective .Pp.  The semantic downside of having a list
entry with an empty body instead of having an item header consisting
of two lines is also the same in both languages.  Then again, while
treating an empty item body as an indication for semantic formatters
that the following body has two tags is a burden on formatters, it's
probably not too heavy to bear, so inventing new syntax to avoid that
burden is likely not reasonable.
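
To make the size of that burden concrete, here is a minimal sketch of
the fusing rule in C.  It is not mandoc code: struct item and
count_fused_entries() are made up for this example, and it assumes the
formatter already has the list as an array of tag/body pairs.

```c
#include <stddef.h>

struct item {
	const char *tag;
	const char *body;	/* NULL or "" if the item has no body */
};

/*
 * Count the list entries that remain after fusing every bodyless
 * item with the item that follows it.
 */
size_t
count_fused_entries(const struct item *items, size_t n)
{
	size_t i, entries = 0;

	for (i = 0; i < n; i++) {
		int empty = items[i].body == NULL ||
		    items[i].body[0] == '\0';

		/*
		 * A bodyless item only contributes an extra tag to
		 * the next entry, unless nothing follows it.
		 */
		if (!empty || i + 1 == n)
			entries++;
	}
	return entries;
}
```

With three items tagged "-a" (no body), "--all", and "-v", the first
two fuse into one entry with two tags, giving two entries in total.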

> If we were to add my proposed list macros, LS/LE to man(7), we could
> de-document `TQ`, as it would be redundant with stacked, bodyless `TP`s
> inside `LS`/`LE` brackets.
> 
> https://lists.gnu.org/archive/html/groff/2022-12/msg00075.html
> 
> I haven't explicitly made the connection to HTML before,

Well, when designing a new language from scratch, you should
always consider prior art, to reduce the risks of reinventing the
wheel, introducing new design mistakes, and leaving design gaps.
In particular, when designing a markup language for documentation, i
consider it critical to carefully compare the design to HTML, LaTeX,
and mdoc(7) before making final decisions, and there may be a few
more that might also be worth looking at for comparison.  Looking at
DocBook is likely *not* a good idea though simply because its design is
so atrocious in so many respects that you will waste massive amounts
of time without learning anything, except maybe what not to do.
That doesn't mean the new design must follow the existing languages,
but not even considering HTML 5 when designing a markup language
feels like straightforward negligence to me.  :-(

> but it's pretty easy to imagine this mapping.
> 
> LS -> 
> TP ->  ... 
> LE -> 
> 
> Is that a tradeoff you'd be willing to make?

Well, *if* you really want to totally redesign the very foundations
of man(7) and change it from almost presentation-only and almost
in-line-macro only to the totally different paradigms of semantic
markup and block oriented, that is definitely one among the many tasks
involved in redesigning.

Yes, structural markup of a list requires saying where the list
starts, where the list ends, and where each item begins.  So that
part of the design of .LS feels right.  If i understand correctly,
the .LS macro will not accept text arguments, which is also good.
The naming seems fair, too.  I did *not* review the proposal though,
so there may be downsides that i am unaware of.  All i'm saying is
that these three points look good, and that the man(7) language indeed
has one of its major weaknesses in this area.

> They could always have done what `TQ` itself does:
> 
> .br
> .ns
> 
> and then
> 
> .TP
> again as usual.
> 
> But that's two low-level requests instead of zero, and one of those
> might take some explaining since it exercises a formatter feat

Re: mandoc -man -Thtml bug: inconsistent vertical space before .TP

2023-10-23 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Mon, Oct 23, 2023 at 04:30:58PM +0200:

> This got me curious about TQ, since mandoc(1) considers it "very rarely
> used, even in GNU pages".
> 
> Ingo, you may want to reword that, since TQ was being used in the Linux
> man-pages project,

Done, thanks for the heads up.
I append the resulting commit below.

> and yesterday I wrote a patch to use it even more:
> 

Strange, i pulled from
  https://git.kernel.org/pub/scm/docs/man-pages/man-pages
and don't see such changes there, so i'm just judging from
code inspection right now, without looking at formatted versions.

I think that is a really bad patch.

 1. It gratuitously makes the description of almost every option
longer by a whole line, which is a significant waste of screen
real estate.  It's further aggravating that due to long options,
most Linux manual pages already have an extra line for the
options at the start of each paragraph.  Now you are
doubling that from one to two wasted lines per option
compared to the same functionality on BSD.  The situation
becomes even more dire because Linux already tends to have
many low-utility options compared to BSDs, so you keep driving
density of useful information down and verbosity and fluff up.

 2. Your argument that this helps searching is a red herring.
The weakness of man(7) in searching has nothing to do with
the *formatting* of list item heads, it is caused by the
lack of *semantic markup*.  There would be no problem creating
search anchors for multiple .Fl topics on the same output line
in mdoc(7).  The only reason i did not do it is because it
is irrelevant for us since we barely have any of those
POSIX-violating long options.

 3. Separating two synonymous .Fl entries onto different .TP/.TQ
lines weakens semantic expressiveness further.  Even though
mandoc already contains special guessing logic in the HTML
formatter to treat .TQ and following .TP as part of the same
list, other formatters and other output modes may be less smart.
I mean, those are not even the same macros, and yet you hope
for them to be recognized as entries in the same list?

 4. Even the best HTML markup that is so far feasible results
in *two* list entries, one with an empty body and only the
second one with the corresponding content.  While that may or may
not look superficially right (depending on the CSS), it certainly
isn't semantically correct and is likely to cause accessibility
grief, for example for blind people.
So, you want HTML formatters (and formatters to other output
languages that support semantic markup) to combine .TQ and
subsequent .TP not only into the same list, but also into the same
list element - but only if the body of the .TQ is empty, i guess?
So you want different macros to behave identically in some
ways (.TQ and .TP part of the same list) but the same macro
fundamentally different depending on its content.  The same
macro sometimes gets its own element and sometimes needs to
fuse into a different macro.
So much fun for implementers of formatting modules!

 5. On top of all that, i have a hard time to think of any macro
that has a more wicked failure mode than .TQ in case the
formatter does not support it.  The output visually looks
perfectly fine, and the reader gets no hint that the *most
important* information is missing.

Actually, part of the reason why i initially added that
additional warning about .TQ was that i felt uneasy about it:
less portable than for example .EX, less important than for
example .UR, no semantic benefit, purely presentational intent,
ad hoc house of cards on the foundation of .TP, which is
already shaky enough in its own right, and a terrible failure
mode.  So at that time in the past, i was quite happy to get
the impression that it was rarely used and stressed that in the
documentation, hoping people would behave reasonably and only
use it in exceptionally dire cases where really nothing else
could possibly help.

Either way, that sentence in our manual page clearly is no longer
true even without that particular patch, so it's gone now.

> TQ seems to be a sibling of TP.  Not sure if this will affect this
> -Thtml bug in some way; my experiments seem good, but they weren't
> exhaustive.

Yes, i believe mandoc -T html is already able to more or less
cope with typical use cases of .TQ.  It seems likely though that
you can construct less typical use cases that render badly to HTML
(or any other semantical markup language for that matter).  You can
be virtually certain that HTML output from any non-mandoc formatter
will be atrocious right now, and any future output to any semantical
markup language will almost certainly be bad initially unless 

Re: GNU groff in articles

2023-10-21 Thread Ingo Schwarze
Hi Jim,

Jim Hall wrote on Fri, Oct 20, 2023 at 11:36:04AM -0500:

> I don't know if you're interested in hearing about articles that
> mention groff, but I wanted to share that I recently hosted a panel
> discussion about technical writing, and GNU groff came up briefly
> in the discussion.

Don't worry, i think what you posted is well on topic on this list.

> We ran this panel as an embedded video inside
> articles on both Technically We Write and Opensource.net:
> 
> https://opensource.net/get-started-with-technical-writing/
> https://technicallywewrite.com/2023/10/20/roundtable

As a quick feedback from my personal point of view:

I would be interested in some of the topics introduced in the two
ledes that you posted as links, but not in all the topics.  However,
while i often use videos for recreational purposes, i generally avoid
using videos to access information in professional contexts because
that's extremely inefficient, a tremendous waste of time.

The same text can be read at several times the speed as being
listened to, and when you are only interested in parts,
skim-reading most of it and studying only the parts you really
care about can be orders of magnitude faster than listening
to a video or podcast.

I find it rather ironic to post a video to celebrate "Writing Day":
It isn't really called "Talking Day", or is it?

Consequently, for lack of accessibility for readers, i cannot comment
on your content in more detail.

Yours,
  Ingo

-- 
Ingo Schwarze 
http://www.openbsd.org/   
http://mandoc.bsd.lv/ 



[bug #64772] [hdtbl] consider deprecating

2023-10-19 Thread Ingo Schwarze
Follow-up Comment #6, bug #64772 (project groff):

[comment #5 comment #5:]

> which implies his previously quoted characterization of the package as
> "buggy as hell" is speculative based on code inspection, rather than empirical
> based on testing.

Note that code review and black-box testing are both methodologies (among
others that are also useful, depending on the situation) that are actively
being used in the software industry for the purpose of quality assurance, and
i have done both in professional capacities (i.e. being paid for doing such
work).  Obviously, all methodologies have their specific strengths and
weaknesses.  For example, black box testing has the advantage of working even
without access to the source code, but comes at the price of being more
difficult and more time-consuming.  Fuzzing has the advantage of reducing the
human working time needed, but at the price of only finding some types of
issues and finding bugs only in a random manner rather than systematically
per-feature.  Human code review is much easier and faster than automated
testing, in particular for judging the overall code quality - admittedly, it
is hard to make sure that a review found *all* the problems, but that's not
the goal here.

I think calling code review "speculative" in this context - as if systematic
testing were somehow better - is not helpful.  If you already know from code
review that code is of bad quality, starting a systematic testing effort would
be nothing but a waste of time, unless somebody is willing to invest the large
amount of time that is required for cleaning the code up.

When garbage code is found in the tree and within three years, no one speaks
up who is using it and no one speaks up who wants to repair it, how long do we
want to wait before throwing it out?

Isn't it a no-brainer that low-quality unmaintained code should be deleted? 
I'd go as far as saying that should be done even if the code is used by a few
people and even if there is no replacement.  If people want to use garbage
code on an individual basis, that is their individual problem, and they can
still do that if they really want to even after deletion because old versions
remain publicly available, but we should not promote garbage code and
encourage its use by redistributing it.



___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




Re: [htmlxref.cnf] Please update link to the Groff manual

2023-10-05 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Sun, Oct 01, 2023 at 06:53:30PM -0500:

> So while changing the name of the directory back to html_node will fix
> some broken link problems, it won't fix them all, and it won't be robust
> in the face of future development.  I'm fairly neutral on the
> "html_node" vs. "groff.html.node" naming issue, but I'm downright
> _opposed_ to limiting my (or future contributors') flexibility in
> updating, expanding, reducing, or otherwise mutating the node names of
> the groff Texinfo manual.  Those shackles are much too tight.

Agreed.  Of course changing the content of documentation must always
be possible, including removing obsolete content.  Renaming nodes
may occasionally make sense, too.

> A.  Put the groff 1.22.4 manual back online, probably as
> https://www.gnu.org/software/groff/manual/groff-1.22.4/html_node/

While that is unlikely to do much harm, i'm not sure it is needed.
I don't think we encourage using old versions of groff, so it is
unlikely to help normal users.  It may occasionally be useful for
people researching the history of groff, though not all that much
because git serves that purpose better.  It may occasionally
contribute to confusion when search engines return deep links
into old documentation to unsuspecting users.  Not a big deal
either way, i guess.

> ...and have
> https://www.gnu.org/software/groff/manual/html_node/
> symlink/redirect to it.

I don't really like that idea.

Many old web pages talk about groff in general rather than about
specific historical versions of groff.  So being pointed at the
current documentation is likely more useful for most users than
being pointed at documentation for some historical version.
Besides, even if a site talks about a definite version of groff,
that's unlikely to be specifically 1.22.4.

Even if a deep link from an old website dies because the content
of groff documentation changes, i don't think that is necessarily
a bad thing.  It may alert the user following the link that the
underlying functionality of groff in the region the website talks
about has likely evolved.

That doesn't mean links to the top level of the manual should break,
unless we are planning to abandon or rename groff as a whole.  ;-)

Please don't overthink all this.  Keeping links stable is good when
it is easily possible, but it's normal that substantially improving
the content of a website implies that *some* URIs occasionally break,
in particular deep links.

> Okay, I am reminded why the suits hate deep linking.  :-|

I don't think that's the reason.  The suits want visitors of their
company website to see the advertisements of the day on the start page,
both to drive marketing and sales and, as you pointed out, to boost
their personal ego.  They want visitors to use the navigation tools
provided by the website itself such that marketing can effectively
steer visitors to those products that generate the best profit - what
the visitors were actually looking for may sometimes be considered of
secondary importance at best.  Many suits care less about efficient
and reliable access to detailed and technical information.

The general rule "if you care about the reliability of your links,
don't link more deeply than you have good reasons to", on the other
hand, is not limited to suits.  I try to abide by that rule, too.

Yours,
  Ingo



[bug #64619] [mdoc] allow distros to maintain mdoc strings

2023-10-01 Thread Ingo Schwarze
Follow-up Comment #4, bug #64619 (project groff):

Not sure why you interpret

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=273903

as being mad.  That's merely a list of things FreeBSD needs to deal with when
upgrading to groff-1.23, and it happens to be public.  It's normal that a
major version upgrade of software that is actively used requires some work. 
For upgrading groff in OpenBSD ports, i know that the list of issues i will
have to deal with will be *much* longer than that, but i'm not planning to
make that list public.

Regarding

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=273599

that particular (very minor) issue actually goes away if groff follows my
recommendation of no longer validating OS release numbers.

Regarding colonization, there is a summary at

https://mandoc.bsd.lv/ports.html

Most BSDs plus illumos use mandoc as the formatting engine, but for
apropos(1), man(1), and man.cgi(8), which implementation various BSD systems
use varies wildly.  As far as i'm aware, the only systems using mandoc by
default in *all four* capacities are OpenBSD and Void Linux.

One reason why Wolfram Schneider probably cares about this is that he is the
maintainer of the old Perl4 man.cgi(8) that FreeBSD is still using, and which,
i believe, is running groff under the hood:

https://man.freebsd.org/cgi/man.cgi/source








Re: [htmlxref.cnf] Please update link to the Groff manual

2023-10-01 Thread Ingo Schwarze
[dropping the external Cc:s to avoid boring uninvolved parties]

Hi Branden,

G. Branden Robinson wrote on Sat, Sep 30, 2023 at 03:59:13PM -0500:
> At 2023-09-30T22:07:44+0200, Ingo Schwarze wrote:

>>   https://uu.diva-portal.org/smash/get/diva2:1189607/FULLTEXT01.pdf

> This link was of particular interest.  It praised groff 1.22.3's small
> size and high speed, but expressed significant frustration with its
> documentation.

That's very weird because the quality of groff documentation was
already excellent, and way above the average quality of software
documentation, even before you started working on it.  Werner Lemberg
and others did an outstanding job on it.  Groff 1.22.3 documentation
is definitely orders of magnitude better than LaTeX documentation -
and yes, i have worked a lot with LaTeX, including professionally
in academic settings.  LaTeX documentation is scattered all over the
place, almost impossible to search through, of widely varying quality
depending on the component, and the system as a whole is generally
almost impossible to use without referring to non-free sources
like Lamport's books - which i generally remember as aiming for a
loose tutorial-style approach, lacking the completeness, rigour, and
conciseness that you get when you use the groff texinfo manual together
with the relevant manual pages.  Yes, criticising the fragmentation
between the texinfo manual and the manual pages is a valid point, but
a very minor one, given that we are only talking about two sources.
For LaTeX, fragmentation of documentation is much worse.

> I wonder if the author would find the situation in 1.23.0 improved.

I doubt that.  I think the author is simply not used to working with
good documentation, just like he is clearly inexperienced with
software in general.  I mean, blaming groff for lower portability
because Microsoft Windows 10 does not install Perl and Ghostscript
by default?  That's just ridiculous.

The point isn't that you should take this particular person seriously -
to the contrary, you probably shouldn't worry too much about what he says.

The point is that the URI is in use across a wide array of media
from diverse sources.

> Plenty of organizations rotate in new director-level IT or "digital
> presence" managers who decree a change in CMSes just so they can say
> how "impactful" they are on their CVs.

That commercial organizations generally do lots of stupid things that
are not in the public interest (nor in their own interest really)
isn't all that surprising.  As you say, no need to emulate
corporate stupidity in the free software world, right?

> I concede that having a working "/html_node/" URL by hook or by crook
> (or by symlink) is probably a good idea given the list of URLs linking
> to it that you presented above.

Sure, you can keep both URIs indefinitely if you want.

There are small downsides to having multiple redundant URIs for
the same resource, like higher maintenance effort and more potential
for confusion among users, so i generally try to keep the best URI
and slowly phase out the others (which usually takes many years)
but that's probably not a big deal.

With a URI component as firmly entrenched as /html_node/, phasing out
is likely no longer possible, even if you have a decade to spare for
the transition time, but for the newish /groff.html.node/, phasing
out may still be possible if you care about consistency.

Yours,
  Ingo



Re: [htmlxref.cnf] Please update link to the Groff manual

2023-09-30 Thread Ingo Schwarze
Hi,

Gavin Smith wrote on Sat, Sep 30, 2023 at 08:10:01PM +0100:
> On Sat, Sep 30, 2023 at 01:15:09PM -0500, G. Branden Robinson wrote:

>> 4.  "They might have changed this by mistake."
>> 
>> Sort of.  I find the "html_node" name uglier, but if there's popular
>> demand to switch it (back), I can see doing that for groff 1.24.

> We'll change htmlxref.cnf to whichever URLs you decide to use going forward.
> 
> If there are no links to the groff Texinfo HTML manuals anywhere on the
> web, it doesn't matter, but it is likely there are at least some somewhere.

Now that i see this bumpy ride explained at length, i realize the
link on this overview page of mine is currently broken, too:

  https://mandoc.bsd.lv/links.html

First paragraph, first line, second link ("manual").
Rather prominently featured because, well, groff is important.

Here are a few more links to the established URI, in alphabetical order:

  https://forums.freebsd.org/threads/converting-a-man-page-with-pandoc.36706/
  https://git.pwmt.org/pwmt/zathura/-/issues/258
  https://github.com/asciidoctor/asciidoctor/issues/3992
  https://github.com/jgm/pandoc/issues/5019
  https://lists.defectivebydesign.org/archive/html/groff/2020-10/msg00066.html
  https://lwn.net/Articles/912260/
  https://news.ycombinator.com/item?id=36066812
  https://perldoc.perl.org/Pod::Perldoc::ToMan.txt
  https://unix.stackexchange.com/questions/623970/writing-vietnamese-in-groff
  https://uu.diva-portal.org/smash/get/diva2:1189607/FULLTEXT01.pdf
  https://www.illumos.org/issues/9367
  https://www.reddit.com/r/groff/comments/gbfsx4/page_number_position/
  ...

In general, i think keeping URIs stable makes sense unless there
are *very* strong reasons for changing them - for example, forceful
loss of control over the domain name, or finding out that the old name
violated a relevant standard.  Isn't making a resource available in
the long term at least half the purpose of a URI in the first place,
if not more than half?

Besides, i don't see how a directory name on a public webserver
could possibly be related to internal file naming conventions
in autotools makefiles.

Arguably,

  https://www.gnu.org/software/groff/manual/html_node/

is even better than

  https://www.gnu.org/software/groff/manual/groff.html.node/

because it's more concise (a virtue in URIs) and less
redundant (a vice in URIs) without lacking any information.
The part of the path "groff/manual/groff" doesn't really make much sense.

Arguably, even /software/groff/manual/html_node/ is excessively
wordy, but making it more concise and easier to remember is not
a sufficient reason for changing it and breaking links.

Probably it's best to go back to the link that has been in use for
at least a decade - and for extra safety, maybe leave the longer
path in place as an alias for a number of years before checking that
no links to it remain on the web, then deleting the longer alias.

Yours,
  Ingo



[bug #64619] [mdoc] allow distros to maintain mdoc strings

2023-09-27 Thread Ingo Schwarze
Follow-up Comment #2, bug #64619 (project groff):

This is a bad idea.  It would cause manual pages to become non-portable.

FWIW, i agree with Branden that the string system is among the more poorly
designed parts of the mdoc(7) language.  Not all of that can be fixed.  In
particular, the existing .St strings have to remain.  But defining new strings
should generally be discouraged.  In the case of .St, it should be limited to
extremely important standards where there can be no doubt that they will be
widely needed across almost all projects.

Regarding hard-coded version numbers, that was a non-starter which turned into
a nightmarish travesty decades ago.  I believe groff ought to simply stop
validating Nx, Fx, Ox, and Dx arguments like mandoc(1) did years ago.

I have no good solution for .Lb.  It's misdesigned from the ground up and
utterly non-portable by necessity no matter what you do.  OpenBSD does not use
it anyway, but that decision does not help the projects that do use it.






Re: groff in openSUSE

2023-08-25 Thread Ingo Schwarze
Hi Michael,

Alejandro Colomar wrote on Fri, Aug 25, 2023 at 02:56:15PM +0200:
> On 2023-08-25 14:05, Michael Vetter wrote:

>> # conflicts with mandoc
>> mkdir man7mp
>> mv man7/man.7 man7mp/man.7mp

Creating a subdirectory for the Linux Man Pages Project sounds
like a bad idea to me.  I would expect that to confuse users
because the Linux Man Pages Project is arguably the most important
documentation on Linux, so relegating even part of that to a
subdirectory feels at least surprising.

> Since you use man-db as your primary man(1) implementation --or I
> thought you do--, having man(7) provided by mandoc makes no sense.
> You should have groff's groff_man(7) be your man(7) --maybe via a
> link page (.so), or via a symlink--.

As the upstream maintainer of the mandoc(1) toolkit, i concur.

The man(7) manual page included in the mandoc toolkit documents
how mandoc implements the man(7) language.  That is only useful
for operating systems regarding man(7) as an obsolete historical
language, which isn't the case for Linux because Linux still
sticks to the 1979 tradition of using man(7) for most manual pages
and even supports writing new manual pages in man(7) - even though
i understand that Alejandro is now also willing to accept new
manual pages written in the younger mdoc(7) language which is more
powerful, easier to write, and less susceptible to compatibility
problems nowadays.

If you want to install the mandoc version of the man(7) manual page
on openSUSE - which may occasionally be useful for very experienced
manual page users on openSUSE who wonder what exactly mandoc implements -,
you should probably pick a different name like mandoc_man(7) by
assigning to the MANM_MAN configuration variable in configure.local
as documented in configure.local.example, for example by adding a
line similar to this to configure.local:

  # Some distributions may want to avoid naming conflicts among manuals.
  # If you want to change the names of installed section 7 manual pages,
  # the following alternative names are suggested.
  # The suffix ".7" will automatically be appended.
  # It is possible to set only one or a few of these variables,
  # there is no need to copy the whole block.

  MANM_MAN="mandoc_man"   # default is "man"

Yours,
  Ingo

-- 
Ingo Schwarze 
http://www.openbsd.org/   
http://mandoc.bsd.lv/ 



[bug #64594] [troff] "warning: cannot select font 'C'"

2023-08-24 Thread Ingo Schwarze
Follow-up Comment #2, bug #64594 (project groff):

Let me provide a shorter answer to supplement what gbranden@ said:

With that kind of low-quality non-portable input poking deeply into
implementation details of the formatter instead of using man(7) macros like a
manual page should, it's not surprising that you run into lots of
formatter-dependent misformatting and into lots of compatibility issues - both
going from one formatter to another and going from one release to the next of
the same formatter.

I have provided more detail here:
https://github.com/jgm/pandoc/issues/9020#issuecomment-1692680356






Re: [PATCH] [grotty]: Use terminfo.

2023-08-23 Thread Ingo Schwarze
Hi Lennart,

Lennart Jablonka wrote on Tue, Aug 22, 2023 at 05:56:51PM +:
> Quoth Ingo Schwarze:
>> Lennart Jablonka wrote:
>>> G. Branden Robinson wrote:

>>>> Until it does, read "terminfo(3)" as "putp(3)".  It brings
>>>> up the relevant page and is quicker to type anyway.

>> Good advice.

>>> That conflicts with my sense of aesthetic.   References should use
>>> the man page title.   The correct way to refer to putp’s man page
>>> is something like “Print stuff using putp (see terminfo(3)).”

>> Absolutely not.  It is very widespread practice to refer to a page
>> by any of its names that can be found by man(1).  Best practice,
>> in any particular instance of a reference, is to use the name that
>> is most logical.  So pointing to putp(3) is actually better in the
>> example you are constructing above because this makes more sense to the
>> reader and is easier to understand.  The user has no reason to care
>> which other function may or may not be documented in the same page.
>> Besides, that is likely an operating-system specific implementation
>> detail and often even changes over time within an operating system.
>> I have personally merged and split manual pages many times in the past,
>> and it would be a maintenance nightmare if performing such a merge or
>> split would require changing all cross-references to the manual pages
>> involved in the split or merge.  On Linux, the inconvenience would
>> be even more dire than on BSD systems because you would have to deal
>> with many different package maintainers, some of whom may be slow to
>> respond, let alone roll a new release for their package, so in Linux,
>> your policy would render manual page maintenance next to unworkable.
 
> I recognize the burden it would be on the maintainers.   I do feel like 
> you’re exaggerating:  How often does it change which man page a function 
> or whatever is documented in?

In OpenBSD, i'd estimate such cases arise roughly every few weeks
on average.  Since OpenBSD is a relatively compact system focussing
on stability and simplicity, i'd expect the rate to be larger in more
modularized, larger, and more feature-hungry systems.

Besides, while the work involved in maintaining cross references among
manual pages does not constitute the majority of work when maintaining
manual pages, it is not insignificant either, in particular when it
comes to larger libraries like OpenSSL or LibreSSL.  Even with the
current, simple scheme, i have spent many days of work on improving
cross references, and i have watched others do similar work.
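
As a rough illustration of why the simple scheme keeps maintenance
manageable, here is a minimal sketch of a cross-reference check.  It
assumes mdoc(7) sources that mark references with the .Xr macro, and it
treats a reference as valid when it matches any known name of a page,
not just one title.  The function names and the `known` set are
hypothetical and not part of mandoc or groff.

```python
def cross_references(source):
    """Return (name, section) pairs for every Xr macro on a macro line."""
    refs = []
    for line in source.splitlines():
        if not line.startswith("."):
            continue  # text lines cannot contain mdoc(7) macros
        fields = line[1:].split()
        # Xr needs two arguments, so stop two fields before the end.
        for i, field in enumerate(fields[:-2]):
            if field == "Xr":
                refs.append((fields[i + 1], fields[i + 2]))
    return refs


def dangling(source, known):
    """Return references that match none of the known page names.

    `known` is a hypothetical set of (name, section) pairs covering
    every name of every installed page, not only the page titles.
    """
    return [ref for ref in cross_references(source) if ref not in known]
```

With a check like this, merging or splitting pages leaves existing
references intact as long as every old name stays among the known names.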

> But then, I don’t work on man pages nearly as much as you do.

>> Besides, "using putp (see terminfo(3))" looks ugly, is unnecessarily
>> wordy, and causes a distraction of the reader for no benefit
>> whatsoever.
 
> Oh, but there is a benefit:  If you print your collection of man pages, 
> calling it something like “OpenBSD Programmer’s Manual,” the man pages 
> will be sorted by their titles.   If done well, you’ll also have a KWIC 
> index of the NAME lines, but it’s still easier to find the page you’re 
> looking for if you have the title directly.

In particular when you care about locating information quickly and
easily, putting it on dead trees is a very bad idea.  KWIC was a
great idea when invented more than 150 years ago and still quite
useful when used for the AT&T UNIX manuals in the 1970ies (because
they didn't have CRT displays at Bell Labs at first).  Even though i
did have a monochrome CRT when using HP200 machines in the 1980ies,
i still used printed documentation because those machines had no hard
disks and the floppies they used weren't even large enough to hold
the complete operating system, let alone documentation, and swapping
floppies and waiting for them to be read would have been awkward.
But i'd say printed software documentation became obsolete roughly
around 1990.

Nowadays, i have a hard time taking KWIC seriously even as a crude
workaround for a lack of semantic search and regular expression
capabilities.

> That’s not much of a concern today.   Few projects today of those that 
> have many man pages collect them in one work.   Groff does;  man-pages 
> does.   Stanley Lieber provides printed man pages for 9front.¹
> ¹ https://9front.org/propaganda/books

That looks like a work of love, aesthetic enjoyment, and nostalgia
to me.  But making the text harder to read for users reading manuals
at the command line or on the web and harder to maintain at the same
time, to marginally help searchability of the printed version, which
remains atrocious by today's standards either way - what a bad tradeoff...

> An OpenBSD Programmer’s Manual?   I’d like to see that, but I doubt
> anyo

Re: [PATCH] [grotty]: Use terminfo.

2023-08-23 Thread Ingo Schwarze
Hi Alexis,

Alexis wrote on Wed, Aug 23, 2023 at 09:02:36AM +1000:
> Ingo Schwarze  writes:

>> The odd one out really is the mandoc implementation of man(1)
>> which does use header and NAME section names for lookup.
>> It is used by default in OpenBSD, Alpine Linux, Void Linux,
>> and Termux, and users can optionally switch to it in Arch Linux,
>> openSUSE, and Fedora.

> On Gentoo as well (and i switched to it myself).

Oh wow.  The option has already been provided since Dec 2020 and
i totally missed it.  It appears you have a large team working on
maintaining system-man and the related ebuilds including man-db, groff,
and mandoc.  Thank you, that seems quite useful.  Getting real-world
usage in diverse environments is beneficial to keep up the stability
and reliability of a software package.  And i believe Gentoo is still
significantly different from many other major Linux distributions in
a number of ways.

So i just updated these with respect to Gentoo,
better late than never:
  https://mandoc.bsd.lv/ports.html
  https://mandoc.bsd.lv/porthistory.html

Yours,
  Ingo



Re: [PATCH] [grotty]: Use terminfo.

2023-08-22 Thread Ingo Schwarze
Hi Lennart,

Lennart Jablonka wrote on Mon, Aug 21, 2023 at 12:45:01AM +:
> Quoth G. Branden Robinson:
>> At 2023-08-19T20:08:06+, Lennart Jablonka wrote:

>> (I observe that ncurses doesn't actually _provide_ a terminfo(3) page,

You can't really say that in general, it appears to be operating system
dependent.  For example, on OpenBSD-current, i get:

  schwarze@isnote $ man -ks 3 terminfo
  termcap, tgetent, tgetflag, tgetnum, tgetstr, tgoto, tputs(3)
    - direct curses interface to the terminfo capability database
  terminfo, del_curterm, mvcur, putp, restartterm, set_curterm, setterm,
    setupterm, tigetflag, tigetnum, tigetstr, tparm, tputs, vid_attr,
    vid_puts, vidattr, vidputs(3) - curses interfaces to terminfo database

The same applies to NetBSD:

  https://man.bsd.lv/NetBSD-9.2/terminfo.3

but apparently not to FreeBSD.  So really, it varies.

>> which is silly.

Actually, it isn't.  Strictly speaking, OpenBSD policy would dictate
deleting the name terminfo(3) from the putp(3) manual page because
there is no such thing as a public API function called terminfo()
that a program could call:

  schwarze@isnote $ man -k Fn=terminfo
  man: nothing appropriate
  schwarze@isnote $ objdump -t /usr/lib/libcurses.so.14.0 | grep putp
  0004b8b0 g F .text  001a putp
  schwarze@isnote $ objdump -t /usr/lib/libcurses.so.14.0 | grep terminfo
   l df *ABS*   home_terminfo.c
  0001cb7a l O .rodata 0007 _nc_get_token.terminfo_punct
  000297b0 g F .text  00cf _nc_home_terminfo

>> Or rather, the page is there, but it doesn't bother to mention its own
>> name in its NAME section so that makewhatis/mandb can find it.

While this situation only occurs in a minority of manual pages,
it is not completely out of the ordinary.  There are several pages
with this property.

Also, please be careful about your terminology.  There is no such thing
as "the" name of a manual page.  Many manual pages - likely even the
majority of them - have more than one name, and there are five
different categories of manual page names:

 * file names - traditional implementations of man(1) search for
   manual page names in the file system, so these are the only names
   they support.  Yet, using hard or symbolic links, a manual page
   can easily have multiple file names, and many operating systems
   and portable software packages use that feature by creating hard
   or soft links to manual pages.
   mandoc(1) doesn't have any such limitation, but recognizes some
   manual page names from the file content in addition to from the
   file names.
 * header names, i.e. those contained in .TH or .Dt macros.
   There is always exactly one header name per page.  Even if you wanted
   to define "the" name of a manual page, this would be by far the worst
   choice because traditionally, it is all caps, so it traditionally
   provides incomplete information.  We are currently in the process
   of deprecating putting all caps here, but that process is going to
   take a long time to complete.
   Traditional implementations of man(1) do not take header names into
   account when locating man pages.  mandoc(1), however, does.
 * NAME section names - traditional implementations of man(1) do not
   take these into account when locating man pages.  mandoc(1),
   however, does.  Many pages have several such names.
 * the first NAME section name, which sometimes is considered more
   important than the other NAME section names, but it's not really
   "the" name of a manual page either.  In a typical library
   development lifecycle, library developers introduce a function,
   use it for years, then realize the design is deficient and can be
   improved upon, so they invent another, related function and document
   it in the existing manual page.  Again years later, the original
   function may become deprecated, but remains documented because some
   existing software still uses it.  So you end up with a manual page
   where the first NAME section name is deprecated.
   This is not at all uncommon.
 * synopsis names, i.e. names appearing in the SYNOPSIS, for example
   using the mdoc(7) .Nm or .Fn or .Fo macros.  Even though man(1)
   does not take these into account when locating manual pages, at
   least not traditionally and not when run with default options,
   these are undoubtedly names because the mandoc implementation
   of apropos(1) can search for them using the "Nm" search directive.
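
The header name and the NAME section names from the list above can be
collected with a few lines of code.  What follows is a minimal sketch,
not how mandoc(1) or man-db actually index pages: it assumes simple,
well-formed man(7) or mdoc(7) input, ignores roff escapes and
conditionals, and the function name page_names is made up for
illustration.

```python
import re


def page_names(source):
    """Collect the header name and the NAME section names of one page."""
    header = None
    names = []
    in_name = False
    for line in source.splitlines():
        fields = line.split()
        if not fields:
            continue
        macro = fields[0]
        if macro in (".TH", ".Dt") and len(fields) > 1:
            # Header name: first argument of .TH (man) or .Dt (mdoc).
            header = fields[1].strip('"')
        elif macro in (".SH", ".Sh"):
            # Track whether we are inside the NAME section.
            in_name = " ".join(fields[1:]).strip('"').upper() == "NAME"
        elif in_name and macro == ".Nm":
            # mdoc(7): one .Nm line per name, possibly followed by ",".
            names.extend(f for f in fields[1:] if f != ",")
        elif in_name and not macro.startswith("."):
            # man(7): text line of the form "name1, name2 \- description".
            head = re.split(r"\s+\\?-\s+", line, maxsplit=1)[0]
            names.extend(n.strip() for n in head.split(",") if n.strip())
    return {"header": header, "name_section": names}
```

A page whose header name does not appear among its NAME section names,
like the terminfo(3) case discussed in this thread, shows up as a
mismatch between the two fields of the result.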

>> Guess I'll be sending a patch.)

I'm not sure what the policies of the specific upstream project you
are considering to send a patch to are, but please take the above
into account and do not confuse them by using misleading language
in the rationale that you provide to them with your patch.

> Wait, man-db doesn’t use a man page’s title to look it up at all?   
> That seems bad.

Well, as far as i am aware, it uses all filenames of the page
for lookup, as *all* 

Re: [PATCH] [grotty]: Use terminfo.

2023-08-20 Thread Ingo Schwarze
Hi Branden,

i did not spend the time yet to understand what this discussion is all
about, and it seems to have very low priority for me, at the same time
as there are lots of moderate to high priority tasks open for me -
including, for example, support for lower-case .TH/.Dt and .SH/.Sh and
for .MR in mandoc plus various other topics, so i'm not likely to look
at the discussion this mail belongs to soon - oh wait, actually, i might
have to investigate what is going on here before updating the OpenBSD
port of groff because if it is dangerous, i might have to hardcode -c in
our groff port to protect our users from vulnerabilities.

I'm not saying that will be needed - but merely that i did not
investigate yet and that i have a bad feeling about it.

G. Branden Robinson wrote on Sun, Aug 20, 2023 at 01:52:14AM -0500:

> I guess I need to understand more about the purported hazards of `-r`.

Fortunately, that part is trivial.

Essentially, using -r for manual page display amounts to enabling remote
exploits.  In some circumstances, it may allow manual page authors to
run arbitrary code as your user ID on your machine.  In many
circumstances, it will cause reliability issues, i.e. remote attackers
can change the way your terminal shows output (hiding information from
you or inserting bogus information) or interprets what you type
(potentially changing the effect of commands you are typing into the
terminal).  It certainly allows remote DOS attacks, i.e. making your
terminal unusable.  If you ever look at manual pages as root -
admittedly, i am quite careful to never do that on systems running man-db
or groff for manual page display, but i occasionally do it on systems
running mandoc, and i guess many Linux users will not be careful to
avoid running man(1) as root - all of the above may turn into
remote root exploits.

In a nutshell, less -r must NEVER be run on untrusted input, and manual
pages are a prime example of untrusted input.  I mean, have you ever
heard about anybody performing a security audit on manual page source
code, to find out whether the manual pages in question contain any
malicious code?  If the answer is "no", or even if it is "well, i assume
there may be at least some manual pages on my system that have not been
audited for security by people i trust", then you have to treat manual
pages as potentially malicious input.

As a matter of fact, i even avoid using less -R for manual page display
for security reasons.  While admittedly, the -R option has been designed
such that it ought to be safe, that is only true as long as the specific
terminal emulator being used doesn't contain bugs that mix up escape
sequences.  As a software developer, i occasionally test non-standard
terminal emulators, and then i don't want to have to remember changing
my PAGER environment variable, so i prefer playing PAGER safely in the
first place.
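
To make the difference between the pager modes concrete, here is a
minimal sketch of output sanitizing.  It illustrates the general idea
only and is not the actual implementation in less(1): the function name
sanitize and the keep_sgr parameter are hypothetical, and passing only
SGR color sequences is a simplification of what less -R permits.

```python
import re

# SGR ("Select Graphic Rendition") sequences: ESC [ digits/semicolons m
SGR = re.compile(r"\x1b\[[0-9;]*m")


def sanitize(text, keep_sgr=False):
    """Make untrusted formatter output safe to write to a terminal.

    Control characters are shown in visible notation instead of being
    passed through.  With keep_sgr=True, SGR sequences survive, which
    is roughly the compromise that less -R aims for.
    """
    out = []
    pos = 0
    while pos < len(text):
        match = SGR.match(text, pos) if keep_sgr else None
        if match:
            out.append(match.group())
            pos = match.end()
            continue
        ch = text[pos]
        code = ord(ch)
        if ch in "\n\t" or ch.isprintable():
            out.append(ch)
        elif code < 0x20 or code == 0x7F:
            out.append("^" + chr(code ^ 0x40))  # caret notation, ESC -> ^[
        else:
            out.append("<%02X>" % code)  # e.g. C1 controls such as CSI
        pos += 1
    return "".join(out)
```

Passing everything through unchanged corresponds to less -r, which is
the case that must never be applied to untrusted input.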

Yours,
  Ingo



Re: Proposed: change `pm` request argument semantics (was: process man(7) (or any other package of macros) without typesetting)

2023-08-17 Thread Ingo Schwarze
Hi,

G. Branden Robinson wrote on Thu, Aug 17, 2023 at 06:44:14PM -0500:
> At 2023-08-17T21:12:35+0200, Alejandro Colomar wrote:

>> The problem is that at no point you can have the .roff source, after
>> the man(7) macros have been expanded.  Would it be possible to split
>> the groff(1) pipeline to have one more preprocessor, let's call it
>> woman(1) (because man(1) is already taken), so that it translates
>> man(7) to roff(7)?

> In other words, you want to see what a *roff document looks like after
> all macro expansions have been (recursively) performed.
> 
> I wanted this, too, back in 2017 when I first started working on groff.
> 
> The short answer is "no".
> 
> The longer answer is that this is hard because GNU troff, like AT&T
> troff, never builds a complete syntax tree for the document the way
> "modern" document formatters do.  nroff and troff were written and
> deployed on DEC PDP-11 machines that are today considered embedded
> microcontroller environments.  Therefore they handled as little input at
> one time as possible.  Roughly, this meant that input was collected,
> macro-expanded as soon as it was seen, and then as soon as it was time
> to break an output line, a lot of formatter state related to parsing was
> flushed, and it started reading input again.
> 
> Understanding *roff a little better 6 years later, I can more easily
> imagine ways to run AT&T troff out of memory on a PDP-11.  Ultra-long
> diversions would be one way,[1]
> [1] Nobody _except_ mandoc(1) seems to handle this well.  Credit where
> it's due.  https://savannah.gnu.org/bugs/?64229

Praise is usually nice to have, but i must admit this particular praise
surprises me on more than one level.  :-)

https://man.openbsd.org/roff.7#di says:

  di divname
          Begin a diversion. Currently unsupported.   [by mandoc(1)]

I'm not completely convinced not supporting a particular request
at all amounts to "handling it well".

Besides,

   $ time { printf '.di foo\n.nf\n'; yes abcdefghijklm; } | mandoc
  mandoc: Cannot allocate memory
  0m07.61s real 0m05.67s user 0m01.81s system

i.e. infinite input crashes mandoc - admittedly via err(3) after
malloc(3) returns NULL, which is relatively controlled, but
still a crash.

But GNU troff isn't actually *that* much worse:

   $ time { printf '.di foo\n.nf\n'; yes abcdefghijklm; } | troff
  Abort trap (core dumped) 
  0m24.72s real 0m04.43s user 0m03.82s system

with this backtrace:

  _libc_abort at /usr/src/lib/libc/stdlib/abort.c:51
  abort_message at .../llvm/libcxxabi/src/abort_message.cpp:78
  demangling_terminate_handler at .../libcxxabi/src/cxa_default_handlers.cpp:66
  std::__terminate at .../llvm/libcxxabi/src/cxa_handlers.cpp:59
  __cxxabiv1::failed_throw (exception_header=0x61079458300)
      at .../llvm/libcxxabi/src/cxa_exception.cpp:152
  __cxa_throw (thrown_object=0x61079458380,
      tinfo=0x61003540200 ,
      dest=0x6100353a340 )
      at .../llvm/libcxxabi/src/cxa_exception.cpp:283
  operator new at .../llvm/libcxx/src/new.cpp:76

Exiting via abort(3) is also a relatively controlled way of dying.
Arguably it's a bit less clean here in troff than in mandoc
because signals are involved, and Unix signals are among the
worst parts of the C and POSIX programming environment and should
be avoided whenever possible, since they are generally fragile
and often invite vulnerabilities.  But in this case, this is not
the fault of GNU troff.  This downside merely follows from the
choice of the implementation language C++, which suffers from
ill-designed, very messy error handling in general.

I'm not sure why you see a SIGKILL getting thrown at the troff process
on your machine - but i *suspect* that may have nothing to do with GNU
troff either and may be an implementation detail of whatever operating
system, C++ compiler, and C++ standard library you are using.  Sure, on
first sight, an explicit abort(3) being called on the C library level
*might* look slightly safer than SIGKILL flying around - then again,
i'm not really sure it makes a difference.  Whether that actually
is a security risk depends on many details you did not disclose.
Quite possibly it isn't.

> because formatted diversion contents
> have to be kept in memory until they're called for.  A multiplicity of
> moderately sized diversions would do it too.  Conditional blocks would
> be another problem.  When encountering a brace escape sequence \{, the
> formatter has to scan ahead in the input.  Or at least GNU troff does.
> Maybe AT&T troff did something clever, but its source code is famously
> opaque.
> 
> I'll say it before Ingo does: mandoc(1) (as I understand it) _does_
> build a syntax tree for the entire document before producing output,
> which enables some of the nice features that it has.

Correct.

However, before Alejandro gets carried away with enthusiasm, let
me emphasize that it does the opposite of what Alejandro is asking
for: He wants all the man(7) macros converted to roff(7) 

Re: [PATCH v2] man*/: ffix (migrate to `MR`)

2023-08-16 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Tue, Aug 15, 2023 at 10:55:22PM -0500:

> If the page is withdrawn, I expect distributors will need to manage the
> man.7 page using Debian's "alternatives" mechanism or similar; if
> groff_man.7 is installed, man.7 should be a symlink to it.  If
> mandoc_man.7 is installed, likewise.  If both are installed, the
> distributor needs to select a default preference.

I think that most general-purpose Linux distributions are probably
better off simply preferring groff_man.7 over the man.7 bundled with
mandoc.  As you know, the mandoc distribution regards the man(7)
language as a legacy language that has been obsolete since about 1990 -
which makes sense for all operating systems based on BSD and Illumos.
Consequently, the mandoc man(7) page only provides bare-bones information
focussing on questions of compatibility and no advice whatsoever
for people intending to write new manual pages.

That perspective is not really helpful for general purpose Linux
distributions: for these, the Linux man-pages project matters a lot,
and that project is not considering the man(7) language as obsolete at
all.  That i keep recommending changing that stance does not appear to
have much effect so far and isn't relevant for the questions at hand.
Either way, a disagreement regarding the merits of some policy is
not a good reason to deprive users of information they might require.

> I expect you will want to emphasize this in the release announcement,
> when the time comes.

In the Linux man-pages project release announcement, i recommend
simply saying that groff_man(7) replaces the former man(7) that used
to be bundled in the Linux man-pages project.  For the purposes of
the Linux man-pages project, the man(7) page distributed with mandoc
isn't useful, so no need to confuse the users of the Linux man-pages
project by talking about it.

> This already needs to happen with soelim(1) and roff(7),

I'm not convinced.  The soelim(1) bundled with mandoc is
really only a stopgap implementation in case people need something
quickly but don't have a real ROFF system around.  I don't think
it sees much use at all.

For the roff(7) manual page bundled with mandoc, the same applies
as for man(7), only more strongly so.  The mandoc roff(7) page is
totally inadequate for learning the roff language.  It merely
intends to document the subset of roff(7) relevant for manual
page authors and maintainers, and the subset supported by mandoc.

> but it doesn't, exactly; Debian renames mandoc's versions of the
> former to msoelim(1) and the latter to mandoc_roff(7).

That makes sense to me, more than using "alternatives" for these two
would, in the case of Debian.  I'm not sure installing the mandoc
soelim program on Debian is even useful in the first place.  If you
really need soelim(1) on Debian, you almost certainly already have
groff(1) installed, or at least you ought to install it.

> Termux simply throws groff's versions
> away and installs mandoc's versions as soelim(1) and roff(7).

That also makes some sense to me.  In Termux, the complete
subsystem for searching, displaying, and viewing manual pages has
been exclusively based on mandoc since 2015.  So while Termux is, in
most aspects, more similar to a Linux distro than to a BSD system,
regarding documentation, it is essentially a BSD and not a Linux
system, so it totally makes sense to follow BSD conventions regarding
manual page naming, installation, and manual page tools.

Besides, it is a system for relatively small devices, and consequently,
the decision to use BSD tools for documentation actually makes
sense because the mandoc toolchain is significantly less resource-
hungry than the groff toolchain.  Of course, mandoc is inadequate
for general-purpose typographic and publishing work - but who in
their right mind would want to write their books and journal
articles on Termux anyway?

A small number of other Linux systems exist where similar arguments
apply, most notably Alpine Linux and Void Linux.  Alpine is heavily
geared towards very small hardware.  Void is among the Linux distros
closest to BSD in philosophy.  Both have been using mandoc exclusively
for even longer than Termux.

But even for Linux distros officially supporting that users switch
their manual page search and display system from man-db+groff to
mandoc if they want to - last time i looked, that included Arch,
openSUSE, and Fedora - installing the mandoc roff(7) as roff(7)
would seem like a bad idea to me.

> I also use Termux.  Imagine my surprise when I upgraded to groff 1.23.0
> on my tablet and brought up roff(7).  I was expecting to see myself in
> the mirror, and what should greet me but the visage of Ingo Schwarze!
> 
> Unnerving, no?

Heh, buhuuu!  

Yours,
  Ingo



[bug #64502] [mdoc] macros should inhibit word breaking where sensible

2023-08-08 Thread Ingo Schwarze
Follow-up Comment #3, bug #64502 (project groff):

FYI: how mandoc(1) does this.

1) mandoc(1) never hyphenates anything - obviously, groff(1) does not want to
emulate *this* aspect.
2) Even at existing hyphens, mandoc(1) only breaks output lines if the word
containing the hyphen is on a text input line, never if it is on a macro or
request input line.
3) In mandoc, rule 2) applies to both mdoc(7) and man(7).

That may sound unbelievably simple to people who previously pondered ploughing
through macro lexica and making tons of individual decisions.
But isn't it beneficial to keep things simple when you have the chance?
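
To show how little logic rule 2) needs, here is a minimal Python sketch.
It is not taken from the mandoc source code; the function name
break_opportunities() is invented, and recognizing macro and request
lines purely by a leading "." or "'" is a simplifying assumption.

```python
import re

# A break is allowed after a hyphen that sits between two word characters.
_AFTER_HYPHEN = re.compile(r"(?<=\w-)(?=\w)")


def break_opportunities(input_line: str) -> list:
    """Split an input line into words, and each word into unbreakable pieces.

    Words on text lines may be broken after existing hyphens;
    words on macro or request lines are always kept in one piece.
    """
    is_macro_line = input_line.startswith((".", "'"))
    result = []
    for word in input_line.split():
        if is_macro_line:
            result.append([word])
        else:
            result.append(_AFTER_HYPHEN.split(word))
    return result


if __name__ == "__main__":
    print(break_opportunities("see check-lib-depends for details"))
    print(break_opportunities(".Xr check-lib-depends 1"))
```

The whole decision hinges on one test of the input line type; no
per-macro table is consulted.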

Besides, i also think rule 2) actually makes sense.  In manual pages, the vast
majority of words that appear on macro lines are some kinds of syntax elements,
and splitting these across line breaks, even at existing hyphens, is certainly
not needed, but may often look weird, no matter whether that happens in the
SYNOPSIS or anywhere else in a manual page.  For example, when listing links
to other manual pages below SEE ALSO, and some of those manual page names
contain hyphens, wouldn't it be better to keep each page name on one line
anyway?  Or even when the same thing happens in the middle of running text?

Admittedly, manual page writers may occasionally use .Em or .I for stress
emphasis in running text, and arguably, in that case hyphenation would be OK. 
But i think that's rare in manual pages compared to using macros for more
technical purposes, and having such an emphasized word hyphenated is certainly
not important.  Sometimes, it may not even look so good to have a few italic
characters at the end of one output line and then a few more at the beginning
of the next.


___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




[bug #63958] [mdoc] decide how to set up hanging indent in Synopsis sections

2023-08-08 Thread Ingo Schwarze
Follow-up Comment #18, bug #63958 (project groff):

To re-iterate, right now, i don't see a need to change anything with respect
to the topic of this ticket.

Unless i'm unaware of some arcane corner cases, i modeled the hanging indents
in mandoc(1) to match how groff(1) does them.

Regarding the variable-width indents in sections 1 & 8, i don't recall ever
being unhappy with them.  That may be because the names of command line
commands tend to be short.  Admittedly, in OpenBSD, Marc Espie blessed us with
a check-lib-depends(1) manual page, but even that still looks passable, and it
doesn't get much longer than that.  The dbus-update-activation-environment(1)
manual arguably does look ugly, but such insanity is rare to the point that i
would call it irrelevant for all practical purposes.

On the other hand, long function names in section 3 are *not* unusual, so with
large variable indents, many pages in section 3 would look really bad.  On top
of that, many section 3 pages document more than one function, and having a
differently wide hanging indent for every function would also be ugly.  So the
current solution of always using 4n is just fine.








[bug #64018] [man, mdoc] decide on a common base paragraph indentation

2023-08-07 Thread Ingo Schwarze
Follow-up Comment #22, bug #64018 (project groff):

[comment #15:]
> > > > We cannot, obviously, have three-letter requests.
> > > Nope.  Like I said, there's room for `Cq`, `Co`, and `Cc`.
> > 
> > Indeed, I see only Co used grepping through all tmacs:
> > tmac.doc.old has it as macro (just .tm’ing to say it’s
> > not an mdoc macro) plus…
> 
> It's in _groff_'s "doc-old.tmac", too, which has the same origin.
> 
> Huh.  I wonder what the story behind that is.

Some years ago, i talked to Cynthia; i think Kirk McKusick helped establish
the contact back then.

What groff calls "doc-old.tmac" is what Cynthia used to call "tmac version 2"
around 1990, whereas she called the language we are now used to "tmac version
3" back then.  If i remember correctly, even Cynthia herself did not keep a
copy of "tmac version 1", and i think she considered it unlikely that such a
copy exists anywhere.  But it's also next to irrelevant at this point.  As
people usually do when designing a new language, she experimented a lot with
preliminary ideas during the early stages and took stuff out again when she
had better ideas and/or collected experience using her new language for
practical work.

Even version 2 was a pre-alpha thing and never used consistently, not even for
any alpha release of the BSD system.  It was used in a relatively small
minority of manual pages in 4.3BSD-Reno (June 1990), which you might call an
official Beta release; the name was intended to indicate "running this is akin
to visiting a Casino."  The vast majority of manual pages in Reno still used
man(7) because the design of the mdoc(7) language was nowhere near finished
and the main work of rewriting the documentation under a free license had
barely started.

Consequently, what mdoc version 2 did or did not do is of very limited
interest even to extreme BSD history geeks.  Compatibility with mdoc version 2
is completely irrelevant for any purpose one could possibly think of because
version 2 was never considered ready for production in the first place.  There
certainly aren't *any* mdoc version 2 documents that anybody uses for any
contemporary purpose in 2023.  Even finding purely historical mdoc version 2
documents that you could use a version 2 formatter on is not all that easy, in
particular not outside McKusick's BSD history CDs.

I suspect that *maybe* Cynthia poisoned .Co in mdoc version 2 because in
4.2BSD, 4.3BSD-Reno, and 4.4BSD, the documentation of the "MH" email handling
system, written in Eric Allman's -me macros, defined its own .Co macro and
used it at quite a few places - but i'm not sure that's the reason.  It
certainly no longer matters today.

> > | mdoc/README:.\" NS Co register (site) Width Needed for Column offset
> > 
> > … I’m not sure if this is still true, given my grep
> > did not find any other occurrence? I think this is
> > old/wrong and needs to be removed.
> 
> It seems likely to me.

I failed to unearth evidence for \n(Co ever being used for anything in any of
my 4.2BSD, 4.3BSD, or 4.4BSD trees.
It seems that line in Cynthia's README file refers to one of her experimental
ideas that she discarded before it ever grew up sufficiently for production
use.

By the way, do we still have that ancient README file in the groff git tree?
I don't readily see it in git, and keeping it would almost certainly be a
mistake.

I think the fact that OpenBSD still has it lying around in the source tree in
/usr/src/share/tmac/mdoc/README is a mistake, too.


> I would guess that Ingo has the world's biggest corpus of _mdoc_(7)
> documents readily at hand--but perhaps not the time to grep them for our
> benefit.  :P

I tried to figure out why you consider `Cq`, `Co`, and `Cc`, only to fail -
still scratching my head...

All the same, FWIW, i just grep'ed the manual pages in the base systems of

OpenBSD-current
FreeBSD-13.0
NetBSD-9.2
DragonFly-3.8.2
4.4BSD-Lite2  (some of these are old, but that may not matter for the purpose
at hand)

and came out completely empty-handed for '^\. *\', i.e. .Co as a line
macro.

Obviously, .Co as a sub-macro is harder to grep for, but '^\..* Co ' and
'^\..* Co$' did not find anything, either.

And Co as a register name?  Well-written mdoc(7) manuals should not expand
registers, but i looked anyway using '\\n.Co': again, nothing.

Maybe a string register name then?  I tried '\\\*.Co': again, only false
positives.

So it doesn't appear anything anywhere is using "Co" as a roff(7) identifier
in mdoc(7) manual pages.
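
For readers who prefer a script over repeated grep(1) runs, the sub-macro,
register, and string searches described above can be approximated with
Python's re module.  This is only a sketch: the names PATTERNS and
classify() are invented, the two sub-macro patterns are merged into one,
and the sample lines in the usage are made up rather than taken from any
of the trees listed above.

```python
import re

# Translations of the grep patterns quoted in the text above.
PATTERNS = {
    "sub-macro": re.compile(r"^\..* Co( |$)"),   # '^\..* Co ' and '^\..* Co$'
    "number register": re.compile(r"\\n.Co"),    # e.g. \n(Co
    "string": re.compile(r"\\\*.Co"),            # e.g. \*(Co
}


def classify(line: str) -> list:
    """Return the names of all patterns that match the given source line."""
    return [name for name, rx in PATTERNS.items() if rx.search(line)]


if __name__ == "__main__":
    samples = [
        ".Op Fl a Co b",        # invented: Co used as a macro argument
        r".if \n(Co>0 .br",     # invented: Co as a number register
        r"text \*(Co text",     # invented: Co as a string
        ".Sh DESCRIPTION",      # no match expected
    ]
    for sample in samples:
        print(sample, "->", classify(sample))
```

Like the grep patterns, these expressions can yield false positives,
so every hit still needs to be inspected by hand.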

Then again, please be aware that i never attempted to build a repository of
mdoc(7) manual pages as comprehensive as possible, i merely have a number of
BSD trees lying around.  I did not check Illumos nor some stand-alone portable
software projects that are using mdoc(7).  Then again, it would be seriously
ill-advised for Illumos or any such projects to introduce any completely new
features, at least not without carefully coordinating with both groff and
mandoc.



[bug #64018] [man, mdoc] decide on a common base paragraph indentation

2023-08-07 Thread Ingo Schwarze
Follow-up Comment #21, bug #64018 (project groff):

Regarding editing manual page source code in such a way as to avoid
particularly ugly line breaks in standard-width 80 column terminal windows:

[comment #12:]
> Possibly, _mdoc_(7) page authors knew this and carefully edited the ones
> that did, so that now no one sees them.
> But they would have been de-semanticizing their inputs by sweating formatting
> details.
> Perhaps Ingo will join me in finding that dubious,

In OpenBSD, people *mostly* avoid such hand-optimization of presentational
markup and stick to semantic markup.  In the vast majority of cases, that
yields good results and minimizes maintenance effort.

However, a smaller number of pages exists that, for one reason or another, are
hard to get into a state both easy to read and looking pleasant, if you purely
stick to semantic markup.  It typically happens with content that is
more complicated and harder to understand in the first place.  These cases are
not typically as simple and straightforward as SYNOPSIS sections; we tend to
stick to semantic markup in the SYNOPSIS.  The auto-breaking features of .Op,
.Fl, and .Ar tend to work reliably in general.

I know for sure that in such cases, our chief documentation maintainer, Jason
McIntyre, occasionally does resort to manually optimizing the source code such
that output lines do not exceed 80 columns, do not break in bad places,
and so on.  Now if we were to suddenly increase the global indentation by 2n,
most of these hand-optimized cases would suddenly become hard to read and ugly
in precisely the way Jason spent some work on avoiding.  That would be bad
because, as i said, these are not just random cases, but typically cases with
complicated content, where causing an additional distraction for the readers
would be particularly unfortunate.

For that reason, i'm definitely not going to increase the global offset from
5n to 7n in mandoc(1).  Even if you were to do that in groff(1), mandoc would
certainly not follow, and i might possibly even patch it back in the OpenBSD
port of groff.

I'm not quite sure how this kind of hand-optimization is regarded in FreeBSD
and NetBSD.  I talked to both Warren Block and Thomas Klausner multiple times
face-to-face, but don't recall ever bringing up this particular topic.  I
guess their view might be somewhat similar, but i'm not completely sure.  It
seems likely to me their approach might be somewhat less systematic and more
ad-hoc than in OpenBSD.  So i cannot exclude that hand-optimization might be
slightly more common and semantic markup slightly weaker on average in their
pages than in ours.  In general, they tend to invest less into documentation
than we do.  I know even less about DragonFly BSD, except that they usually
follow FreeBSD quite closely (even though often with significant time delays)
unless they are specifically working on an area of their system, so i doubt
they are doing much general-purpose manual page work in the first place.

> even if he hates the changed indentation (which I aim to change back, and
> port over to _groff man_(7), in case that wasn't clear).

I felt relieved when earlier comments in this ticket made this seem likely to
me.  Thank you!






Re: Warn on mid-input line sentence endings

2023-05-02 Thread Ingo Schwarze
Hi,

Alejandro Colomar wrote on Wed, May 03, 2023 at 02:35:41AM +0200:

> Heh!
> Branden wasn't enthusiastic about my emails when I wrote poetry in them, though :/
> Any chance we can warn users that they should write poems, not prose?
[...]
> Just kidding, but technically, it's probably more accurate, and more fun.

For demonstration purposes, i present a short poem, hoping you like it:

  Pesky poets badly
  Breakit.  Poems permit! 

My point is that excessively smart ideas like "write poems, not prose"
make nothing clear; the times when poetry followed strict rules lie
thousands of years in the past.  When designing technical terms for
use in technical documentation, aim for clarity and simplicity and
refrain from trying to seem witty or funny.

Sure, making the documentation pleasant to read is fine as a secondary
goal as long as that doesn't harm the primary goals - correctness,
clarity, completeness, conciseness, and systematic organization.

Even the occasional joke, applied sparingly, has its place:

   $ man strftime | grep -A 1 ^B   
  BUGS
 There is no conversion specification for the phase of the moon.

But such is indeed the place: near the bottom of the page, where
it doesn't get in the way of the many people who need to quickly
find the relevant information while focussing on something else,
namely on coding or on using some program in a complicated way.
We should certainly refrain from joking while defining technical terms.

Yours,
  Ingo



Re: Prefix for warnings (was: device-dependent warnings)

2023-05-02 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Wed, May 03, 2023 at 01:26:55AM +0200:

> I find it more readable when there's one space between the program
> that generates the warning and the file.  That's what mandoc(1) does,
> and in general, what any program that relies on perror(3) does (I'm
> assuming mandoc(1) probably calls perror(3) or similar).

Mandoc does not use perror(3); the code in question is in the
function mandoc_msg(3) in this file:

  https://cvsweb.openbsd.org/src/usr.bin/mandoc/mandoc_msg.c?rev=HEAD

That is, the message output function has been polished to be as
simple and straightforward as possible, but without hardcoding
stderr.  In -T lint mode, the messages go to stdout instead of
stderr because in that mode, the messages are considered to be
the desired output of the program and not error messages.
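
The design idea, an output function that receives its stream as an
argument instead of hardcoding stderr, can be sketched in a few lines of
Python.  This is not a translation of mandoc_msg.c; the names
emit_message() and choose_stream() are hypothetical, and the message
layout is merely copied from the mandoc warnings quoted in this thread.

```python
import io
import sys


def emit_message(out, progname, path, line, col, level, text):
    """Write one diagnostic to the stream chosen by the caller."""
    out.write(f"{progname}: {path}:{line}:{col}: {level}: {text}\n")


def choose_stream(lint_mode: bool):
    """In lint mode, messages are the desired output, so use stdout."""
    return sys.stdout if lint_mode else sys.stderr


if __name__ == "__main__":
    emit_message(choose_stream(lint_mode=True), "mandoc", "man8/unitd.8",
                 6, 2, "WARNING", 'first section is not "NAME": Sh Name')

    # Because the stream is a parameter, output is trivial to capture.
    buffer = io.StringIO()
    emit_message(buffer, "mandoc", "man8/unitd.8", 8, 2, "WARNING",
                 "description line outside NAME section: Nd")
    print(buffer.getvalue(), end="")
```

Passing the stream explicitly keeps the formatting code identical for
both modes; only the caller decides where the text ends up.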

> mandoc: man8/unitd.8:6:2: WARNING: first section is not "NAME": Sh Name
> troff:man3/unlocked_stdio.3:123: warning [p 2, 1.8i, div '3tbd1,0',
> 0.3i]: cannot break line

And:

   $ cat oops.txt  
 cat: oops.txt: No such file or directory

So i certainly agree that the space character makes the message
slightly easier to read and makes it look slightly more familiar.

Then again, happy bikeshedding!
  Ingo



Re: Warn on mid-input line sentence endings

2023-04-30 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Sun, Apr 30, 2023 at 07:34:57AM -0500:

> Hmm, I see that was Bjarni's doing.  Being from Iceland, he perhaps has
> more of the spirit of Loki than most...

Please do not jump to conclusions.  I know at least one Icelander
personally and he is a very pleasant and intelligent guy.

The problem with Bjarni causing confusion and wasting our time over
and over again lies with Bjarni personally and Bjarni alone, not
in any way with Icelanders.

I realize that you likely intended the above statement as a joke,
and fair enough in that case.  As a German, i can live with
being labeled as not readily understanding British humour.  ;-)

But i think we should be crystal clear about such matters in public
in order to not cause misleading impressions on casual bystanders.


Regarding the naming bikeshed this arose from, i still like
the wording "new sentence, new line" that mandoc(1) adopted
because Jason McIntyre keeps saying that.  It is concise, simple,
and instructive.  But i don't feel strongly about how the warning
is called.  The simpler, the better.

[...]
> I'm strident on this point because I'm opposed to putting a diagnostic
> into the formatter that throws false positives.  That would disserve
> users.

That is very laudable: avoiding false positives as far as possible is
a very good idea, even though it's not usually possible to bring the
rate of false positives down to strictly zero, and there is almost
always a tradeoff between accepting a small number of false positives
or not having a useful warning at all.

For error messages, false positives are quite bad and should be
avoided almost completely if possible.  For style messages, a small
number of false positives are often unavoidable, but minimizing them
is still worthwhile.  "New sentence, new line" is an excellent example
of a style warning where a good parser can do a good job to keep the
rate of false positives low, whereas an ad-hoc partial parser in some
scripting language will almost certainly cause more noise.  All the
same, this is also an excellent example of a warning where even a
very good parser will hardly bring the rate down to zero, and where
accepting the very small residual rate makes sense in order to have
such a quite useful warning.
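
To make the tradeoff concrete, here is a deliberately naive check of the
"ad-hoc partial parser" kind, written in Python.  It is not the logic
used by mandoc(1) or groff(1); the name naive_check() and the regular
expression are invented for this illustration, and the sample lines are
made up.

```python
import re

# A sentence-ending character, optional closing punctuation, at least one
# space, and then an upper-case letter, all in the middle of an input line.
_MID_LINE_END = re.compile(r"""[.!?][)\]'"]* +[A-Z]""")


def naive_check(lines):
    """Return the 1-based numbers of text lines that look suspicious."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if line.startswith((".", "'")):     # skip macro and request lines
            continue
        if _MID_LINE_END.search(line):
            hits.append(number)
    return hits


if __name__ == "__main__":
    page = [
        "This is one sentence.  This is another.",   # genuine finding
        "See e.g. POSIX for details.",               # false positive
        "A clean line.",
        ".Sh NAME",
    ]
    print(naive_check(page))
```

The second sample line is flagged although no sentence ends there: the
check cannot tell an abbreviation from the end of a sentence, which is
exactly the kind of noise a more careful parser tries to reduce.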

Yours,
  Ingo



Re: Lowercase in section names

2023-04-28 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Fri, Apr 28, 2023 at 04:57:49PM +0200:

> I got a lot of warnings due to using lowercase in section headings.
> I remember we agreed to transition to true case in the title, but
> don't remember if we reached the same agreement in section headings.
> 
> Do you agree to not use uppercase in section headings unnecessarily?
> The rationale is more or less the same as in the title: not discard
> information.

Plus better accessibility for people using screen readers.

>  We already have bold for giving it more importance.
> 
> Anyway, here goes my little complaint:
> 
> mandoc: man8/unitd.8:6:2: WARNING: first section is not "NAME": Sh Name
> mandoc: man8/unitd.8:8:2: WARNING: description line outside NAME section: Nd
> mandoc: man8/unitd.8:9:2: WARNING: first section is not "NAME": Sh Synopsis
> mandoc: man8/unitd.8:21:2: WARNING: first section is not "NAME": Sh 
> Description
> mandoc: man8/unitd.8:28:2: WARNING: first section is not "NAME": Sh Options
> mandoc: man8/unitd.8:55:2: WARNING: first section is not "NAME": Sh Exit 
> status
> mandoc: man8/unitd.8:57:2: WARNING: first section is not "NAME": Sh Files
> mandoc: man8/unitd.8:64:2: WARNING: first section is not "NAME": Sh Sockets
> mandoc: man8/unitd.8:69:2: WARNING: first section is not "NAME": Sh Copyright
> mandoc: man8/unitd.8:73:2: WARNING: first section is not "NAME": Sh See also
> 
> Okay, I can understand the first two, and I can easily grep them away,

I certainly intend to add support for mixed-case section names to mandoc.
The work merely isn't done yet.

> but does mandoc(1) really need to remind me about that minor
> thing at every section?  :P

No, that looks like a bug to me.  Thanks for reporting!

Yours,
  Ingo



[bug #64018] [man, mdoc] decide on a common base paragraph indentation

2023-04-09 Thread Ingo Schwarze
Follow-up Comment #3, bug #64018 (project groff):

I see no reason why the standard indentation of running text from the left
margin of the paper and the default indentation inside a tagged paragraph need
to be related in any way.  In fact, in 1990 Kernighan troff, the former was
\n(IN and the latter was \n()I.  Both happened to be .5i, but they could be
controlled independently of each other.

As another example, in mdoc(7), the default indentation inside tagged lists is
8n: 6n for the tag itself and an additional 2n for spacing, exactly as
suggested by Alejandro.

Would anything be wrong with restoring the indentation of running text from
the left margin from 7n to again be 5n but at the same time leaving anything
that happens inside the document content, i.e. to the right of those 7n or 5n
of global indentation, completely unchanged?






[bug #64018] [man, mdoc] decide on a common base paragraph indentation

2023-04-09 Thread Ingo Schwarze
Follow-up Comment #1, bug #64018 (project groff):

The setting ".nr IN 7.2n" is present since the beginning of the groff
repository, i.e. groff-1.06, Sep 1 12:28:08 1992, file tmac/tmac.an .

The groff-1.01 released by James Clark on Mar 13 12:49:40 1991 and declared as
a "beta-test version" that is contained in 4.3BSD-Net/2 (Aug 20, 1991) also
contains ".nr IN 7.2n".

The 4.3BSD-Reno release (June 1990) does not yet contain groff (possibly
because groff did not yet exist in June 1990, not even as a beta release -
Wikipedia claims the first groff release was 0.3.1 in June 1990, without
specifying a source) and still uses Kernighan's non-free device independent
troff, licensed from AT&T.  It contains ".nr IN .5i" in the man macro set.

Consequently, it is almost certain that it was James Clark who changed from 5n
to 7n during the very early stages of groff development, and definitely
earlier than the 1.01 release.  In the CHANGES and ChangeLog files contained
in groff-1.01, i see no explanation of why he made the change.

I also consider it likely that AT&T troff never moved away from the 5n
default.  For example, UNIX v10 (1988) contains:

https://minnie.tuhs.org/cgi-bin/utree.pl?file=V10/man/man0/tmac.v10

.if n .nr )M 5n
.nr IN \\n()Mu


I admit the following remains somewhat speculative unless we ask her, but it
seems likely to me that Cynthia used the 5n default for mdoc(7) because she
likely did not yet have access to groff when she started mdoc(7) development:
Reno already contains mdoc(7) macros, but no groff yet.  She once told me that
when she got access to groff, she liked it so much that she proposed to drop
support for Kernighan's troff in manual pages, but that proposal was vetoed by
other members of the CSRG, so the mdoc(7) macros remained compatible with both
roff implementations until the CSRG disbanded in 1995.

To summarize, it was likely James Clark who changed the indentation without so
much as providing a rationale, at least not one that survives to this day,
while it is mdoc(7) that upholds the original UNIX tradition in this respect.


___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




[bug #63958] [mdoc] decide how to set up hanging indent in Synopsis sections

2023-04-09 Thread Ingo Schwarze
Follow-up Comment #8, bug #63958 (project groff):

re comment #5:

re bug #63046: We both know that i dislike excessive, gratuitous
configurability like that - it adds complexity without any benefit, no need to
discuss that over and over again.  I can live with it as long as the defaults
are sane, such that i can hardcode the defaults into mandoc(1).

I call "5n is better" an objective assessment because it does not rely on
personal taste but purely relies on factual arguments:

1. Both macro sets agree that subsection indentation is 3n, so that should not
be changed.

2. Obviously, paragraph indentation must be greater than subsection
indentation.  4n would not be a good choice because that's so close to 3n that
subsection headers would not stand out from the text, because headers and text
would be easy to confuse.

3. Looking at any mdoc(7) page containing subsections shows that 5n is
sufficient to make the two instantly distinguishable.  Consequently, anything
more than 5n is a waste.

I agree that subjective judgement would be required to decide whether any
particular amount of waste would be acceptable to trade for some potential
benefit.  But no one made any claim that 7n provides any benefit; the only
fact mentioned is that this is what groff_man(7) - and for compatibility,
mandoc(1) man(7) - currently does.

Your detailed questions all seem irrelevant.  The fact that one implies a
waste and the other does not makes the other better.

I would very much welcome a change from 7n to 5n and would make sure mandoc(1)
immediately follows.






[bug #63958] [mdoc] decide how to set up hanging indent in Synopsis sections

2023-04-09 Thread Ingo Schwarze
Follow-up Comment #6, bug #63958 (project groff):

re comment #4:

Yes, i believe it must have been a widget-clicking goof on my part, and i
didn't even notice it when the "status" changed.  Sorry for that.

I don't like interactive web interfaces except those that are strictly
read-only, and it seems they don't like me all that much either.  :)






Re: Accessibility of man pages

2023-04-08 Thread Ingo Schwarze
Hi Dirk,

Dirk Gouders wrote on Sat, Apr 08, 2023 at 10:59:32PM +0200:
> Ingo Schwarze  writes:
>> Dirk Gouders wrote on Sat, Apr 08, 2023 at 09:48:13PM +0200:

>>> Yes, it's very slow but close to `man -K`:
>>> 
>>> find... man -K...
>>> 
>>> real 107.45 real 96.34
>>> user 117.06 user 70.11
>>> sys 14.43   sys 26.86
>>> 
>>> [a thought later]
>>> 
>>> Oh, I found something much faster:
>>> 
>>> $ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE
>>> [snip]
>>> 
>>> real 24.30
>>> user 32.34
>>> sys 6.84
>>> 
>>> Hmm, perhaps, someone has an explanation for this?

>> These are all terribly slow IMHO.
>>
>> For comparison, this happens on my OpenBSD notebook, with more than
>> five hundred optional software packages installed in addition to the
>> complete default installation:
>>
>>$ time man -k any=RLIMIT_NOFILE
>>   dup, dup2, dup3(2) - duplicate an existing file descriptor
>>   getrlimit, setrlimit(2) - control maximum system resource consumption
>>   sudoers(5) - default sudo security policy plugin
>> 0m00.21s real 0m00.00s user 0m00.03s system

> Yes, this is really fast and would allow for quite interesting ways to
> work with manual pages.
> 
> But, OpenBSD's `man -k` operates on a makewhatis(8) database and not
> on every single manual page or am I wrong?

Yes, you are completely correct about that.
The database format is documented here:

  https://man.openbsd.org/mandoc.db.5

And the search syntax here:

  https://man.openbsd.org/apropos.1

The concept works very well because in contrast to man(7), mdoc(7)
provides substantial semantic markup (without being harder to write
or maintain).

The comparison seemed relevant to me because as far as i understood the
intention of the thread, participants were looking for ideas to make
searching for content in manual pages more powerful and more efficient.
The combination of semantic markup and indexing of marked up content
is one way to make progress in that direction, and the combination
of mdoc(7) with mandoc(1) is an example of a system demonstrating
the concept.
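
To make the idea concrete, here is a toy sketch in shell.  It is
emphatically not how makewhatis(8) or mandoc.db(5) work: the flat
"value<TAB>file" index, the function names build_index and
search_index, and the two sample pages are all made up for
illustration.  It only shows why looking up pre-extracted markup
(here, .Dv arguments) can be cheaper than scanning every page on
each query.

```shell
# Toy illustration only; this is NOT the mandoc.db(5) format.
# All names and the index layout are hypothetical.

# build_index DIR: print one "value<TAB>file" line for every .Dv macro
# found in the files directly inside DIR.
build_index() {
    for f in "$1"/*; do
        awk -v file="${f##*/}" '$1 == ".Dv" { print $2 "\t" file }' "$f"
    done | sort -u
}

# search_index INDEX KEY: print the files whose indexed value equals KEY.
search_index() {
    awk -F '\t' -v key="$2" '$1 == key { print $2 }' "$1"
}

# Two tiny made-up mdoc(7) fragments to index.
demo=$(mktemp -d)
printf '%s\n' '.Sh DESCRIPTION' 'The limit' '.Dv RLIMIT_NOFILE' 'applies.' \
    > "$demo/getrlimit.2"
printf '%s\n' '.Sh DESCRIPTION' 'See' '.Dv SIGKILL' 'for details.' \
    > "$demo/kill.2"

build_index "$demo" > "$demo.idx"        # done once, when pages change
search_index "$demo.idx" RLIMIT_NOFILE   # done per query; prints getrlimit.2
```

The index is built once and each query reads only the index, which is
the same division of labour as between makewhatis(8) and man -k, even
though the real database is far more elaborate.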

I understand people familiar with GNU info(1) pointed out that
providing index entries that do not correspond to marked up
content is also occasionally useful.  I do not completely disagree
with that, and the mdoc(7) language as implemented by mandoc(1)
provides a dedicated macro to do just that:

  https://man.openbsd.org/mdoc.7#Tg

Then again, practical experience shows that manual tagging is needed
only in extremely rare cases and completely automatic tagging produces
completely satisfactory index entries for the vast majority of cases.

Yours,
  Ingo



Re: Accessibility of man pages

2023-04-08 Thread Ingo Schwarze
Hi Dirk,

Dirk Gouders wrote on Sat, Apr 08, 2023 at 09:48:13PM +0200:

> Yes, it's very slow but close to `man -K`:
> 
> find... man -K...
> 
> real 107.45 real 96.34
> user 117.06 user 70.11
> sys 14.43   sys 26.86
> 
> [a thought later]
> 
> Oh, I found something much faster:
> 
> $ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE
> [snip]
> 
> real 24.30
> user 32.34
> sys 6.84
> 
> Hmm, perhaps, someone has an explanation for this?

These are all terribly slow IMHO.

For comparison, this happens on my OpenBSD notebook, with more than
five hundred optional software packages installed in addition to the
complete default installation:

   $ time man -k any=RLIMIT_NOFILE
  dup, dup2, dup3(2) - duplicate an existing file descriptor
  getrlimit, setrlimit(2) - control maximum system resource consumption
  sudoers(5) - default sudo security policy plugin
0m00.21s real 0m00.00s user 0m00.03s system

   $ time man -k 'any=rlimit'   
  ps(1) - display process status
  brk, sbrk(2) - change data segment size
  dup, dup2, dup3(2) - duplicate an existing file descriptor
  execve(2) - execute a file
  fork(2) - create a new process
  getdtablecount(2) - get descriptor table count
  getrlimit, setrlimit(2) - control maximum system resource consumption
  mlock, munlock(2) - lock (unlock) physical pages in memory
  mlockall, munlockall(2) - lock (unlock) the address space of a process
  pledge(2) - restrict system operations
  poll, ppoll(2) - synchronous I/O multiplexing
  quotactl(2) - manipulate filesystem quotas
  sigaction(2) - software signal facilities
  getdtablesize(3) - get descriptor table size
  login_cap, login_getclass, login_close, login_getcapbool, login_getcapnum, 
login_getcapsize, login_getcapstr, login_getcaptime, login_getstyle, 
setclasscontext, setusercontext(3) - query login.conf database about a user 
class
  signal, bsd_signal(3) - simplified software signal facilities
  sigvec(3) - software signal facilities
  core(5) - memory image file format
  login.conf(5) - login class capability database
  sudoers(5) - default sudo security policy plugin
  fork1(9) - create a new process
  mi_switch, cpu_switchto(9) - switch to another process context
  0m00.05s real 0m00.01s user 0m00.00s system

   $ time man -k any=RLIMIT_NOFILE 
  dup, dup2, dup3(2) - duplicate an existing file descriptor
  getrlimit, setrlimit(2) - control maximum system resource consumption
  sudoers(5) - default sudo security policy plugin
0m00.01s real 0m00.01s user 0m00.01s system

The effect that the time goes down from 210 milliseconds to 10
milliseconds when doing the search a second time is due to the fact
that the kernel now has the required information in the buffer cache
and no longer needs to read from the rotating disk.  The machine in
question has i5 2.3 GHz processors and 8 GB of RAM, so it's hardly
a high-end machine.

Yours,
  Ingo



[bug #63958] [mdoc] decide how to set up hanging indent in Synopsis sections

2023-04-08 Thread Ingo Schwarze
Update of bug #63958 (project groff):

  Status:None => Duplicate  

___

Follow-up Comment #3:

Your comment #2 seems fair enough in general, but note that i'm not guaranteed
to be the happiest rabbit in the world if that ends up giving me a 31n
indentation in X509_STORE_CTX_get_ex_new_index(3),
see https://man.openbsd.org/X509_STORE_CTX_get_ex_new_index.3 .

Yes, i frankly admit that they are crazy, these Romans!


Besides, i don't think this has anything to do with subsection heading
indentation (which is 3n in both mdoc(7) and man(7)) nor with base paragraph
indentation (which is traditionally 7n in man(7) and 5n in mdoc(7)).  Please
do not change the mdoc(7) indentation from 5n to 7n.  I claim the 5n
indentation is objectively better.  7n wastes too much screen real estate. 
The only reason i didn't reduce the 7n to 5n in mandoc(1) is groff
compatibility.  But mandoc(1) does provide an option "-O mdoc" which, among
other minor tweaks, implies "-O indent=5".






[bug #63958] [mdoc] decide how to set up hanging indent in Synopsis sections

2023-04-08 Thread Ingo Schwarze
Follow-up Comment #1, bug #63958 (project groff):

1. mandoc(1) uses 4n, presumably for compatibility with groff.
2. That's only because mandoc(1) does not support mixed-case section names
yet.  Apply s/Synopsis/SYNOPSIS/ and mandoc(1) works as expected.  Yes, this
will be fixed in mandoc(1), hopefully in the not too distant future.
3. I doubt that.  Apply s/void/unicorn/ and the indentation remains 4n.
4. Do not use .Bl in the SYNOPSIS, in particular not for such an extremely
simple case.  The mdoc(7) code is OK as is.  I recommend using .Fo if there
are any long arguments or more than two arguments, but that's merely a weak
stylistic recommendation for readability of the source code.  See below for
how i would write this thing.


.Ft void
.Fo timerdday
.Fa "struct timespec *earliest"
.Fa "struct timespec *latest"
.Fa "struct timespec *resolution"
.Fc


Using \% in a SYNOPSIS seems bogus to me.  The program ought to do the right
thing by default, and i believe mandoc(1) does.  If groff hyphenates type or
argument names, it should be fixed.

Having a hyphen in "struct timespec" seems even more bogus to me; is that even
valid C syntax?

Then again, i don't recall ever seeing a function called timerdday(3) before. 
I neither found it in the OpenBSD nor in the FreeBSD tree.  I would be quite
surprised if it were, as you seem to claim, a BSD thingy.






Re: Don't despair! (was: [groff] 06/23: [man pages]: Define page-local `MR` fallback.)

2023-02-24 Thread Ingo Schwarze
Hi Branden,

very brief feedback:  If all you care about is new groff, old groff,
mandoc and Heirloom, then what you committed does indeed work,
but so would this much simpler version, so i assumed you were
aiming higher:

  .\" Define fallback for groff 1.23's MR macro if the system lacks it.
  .if !d MR \{\
  .  de MR
  [...]

I noticed that you put "mandoc" in a comment on a line that mandoc
does not need (and that, incidentally, no other formatter you are
targeting needs either).

Besides, i concluded from the comment "non-groff *roff" that you
were targeting other formatters besides groff, mandoc and
Heirloom, and i wondered what those might be.
Now it appears you targeted none beyond those three.

>> I would have preferred being wrong, but here we are.  :-(

> I think you are in fact wrong here,

What a relief!  =:c)

Yours,
  Ingo



Re: [groff] 06/23: [man pages]: Define page-local `MR` fallback.

2023-02-24 Thread Ingo Schwarze
Hi Branden,

sorry for being totally swamped right now and temporarily losing
track of mandoc and groff stuff, but by some strange chance, this
caught my eye:

> +.\" Define fallback for groff 1.23's MR macro if the system lacks it.
> +.nr do-fallback 0
> +.if !\n(.f   .nr do-fallback 1 \" mandoc
> +.if  \n(.g .if !d MR .nr do-fallback 1 \" older groff

This is excessively complicated.  Mandoc has been supporting d and !d
for a very long time:

  schwarze@isnote $ mandoc -Tascii
  "if !d" works:
  .if !d MR yes
  .if  d MR no
  ()  ()

  "if !d" works: yes

Besides, in general, using \n(.g is always an extremely bad idea.
Capabilities of implementations (both groff and others) change
over time, so never test for application names or version numbers,
but always test for features.

For example, mandoc supports vast swaths of groff features,
and yet:

  schwarze@isnote $ mandoc -Tascii 
  .ie \n(.g groff
  .el not groff
  ()  ()

  not groff

> +.if !\n(.g   .nr do-fallback 1 \" non-groff *roff

What exactly is this aiming for?

  schwarze@isnote $ /usr/local/heirloom-doctools/bin/nroff
  "if !d" works:
  .if !d MR yes
  .if  d MR no
  "if !d" works:  yes

Looks like Heirloom does not need it.

  schwarze@isnote $ /usr/local/heirloom-doctools/bin/nroff 
  .if !\n(.g   .nr do-fallback 1 \" non-groff *roff
  .if \n[do-fallback]  doing fallback
  .if !\n[do-fallback] not doing fallback
  do-fallback] not doing fallback

Looks like Heirloom blows up quite loudly with your
new "compatibility" code.

  schwarze@isnote $ /usr/local/plan9/bin/nroff 
  "if !d" works:
  .if !d MR yes
  .if  d MR no
  "if !d" works:

Admittedly, Plan 9 supports neither d nor !d.
But your "compatibility" code does not work either:

  schwarze@isnote $ /usr/local/plan9/bin/nroff 
  .if !\n(.g   .nr do-fallback 1 \" non-groff *roff
  .if \n[do-fallback]  doing fallback
  .if !\n[do-fallback] not doing fallback
  do-fallback] not doing fallback

I did warn you that .MR might land you in trouble...

I would have preferred being wrong, but here we are.  :-(

Yours,
  Ingo



Re: man(7), hyphen, and minus

2022-12-29 Thread Ingo Schwarze
Hi,

Russ Allbery wrote on Sat, Dec 24, 2022 at 02:43:44PM -0800:
> "G. Branden Robinson"  writes:
>> At 2022-12-23T12:49:15-0800, Russ Allbery wrote:

>>> I've been curious: how much use do you see of groff outside
>>> of man pages?

I use it for the slides of all my conference presentations,
and it happened more than once to me that other speakers
approached me at conferences saying, "by the way, thanks for
documenting in the source code of your slides to the *foobar*
conference how to make slides with groff.  I tried it and it
works great for me."

All the same, i admit relatively few speakers do that.

[...]
> for instance, I still use Usenet).

Even though i have to admit that i stopped using Usenet long ago,
probably more than 20 years ago, i'd still like to thank you for
the work you did on Usenet.  I had a lot of fun with it during
the mid-90s.

[...]
>> Heirloom Doctools is a descendant of AT&T troff; among other things, it
>> provides its own man(7) implementation, a lineal descendant of Doug
>> McIlroy's 1979 original.  It _can_ and _does_ render man pages.  Whether
>> any *nix distribution ("platform"?) ships Heirloom as its sole or
>> preferred *roff, I don't know.  I wouldn't be surprised if at least one
>> BSD does, for the usual reasons of GPL antipathy[2].
>> [2] The CDDL is way _more_ free than the GNU GPL, you see, because it is
>> a copyleft _and_ has a choice-of-law clause, and someday the BSDs
>> will have an island microstate nullifying all copyleft licenses.

I would be very surprised if there were any BSD system using Heirloom
roff "as its sole or preferred *roff", for two reasons:

 1. The UC Berkeley Computer Systems Research Group gradually switched
    from Kernighan's device independent troff to GNU troff as the
    preferred *roff around 4.4BSD times, supporting both in parallel
    for some time.  The switch was completed by removing the non-free
    AT&T code from 4.4BSD-Lite.  Consequently, all BSDs after the CSRG
    originally used groff as their preferred *roff.  If any would now
    use Heirloom, that would have required switching from groff to
    Heirloom at a time later than 4.4BSD-Lite2, which i would regard
    as very astonishing.

 2. The list of existing BSD systems is a few orders of magnitude shorter
    than the list of existing Linux distributions.  Basically, here
    is a complete list, chronologically by project start:

     - NetBSD: provides groff (1.19.2) and mandoc, prefers mandoc for man(1)
     - FreeBSD: provides mandoc only in base,
       at least groff and Heirloom in ports
     - OpenBSD: provides mandoc only in base,
       groff and Heirloom and Plan9 as ports
     - DragonFly: provides mandoc only in base,
       at least groff and Heirloom in ports
     - various FreeBSD derivatives; these frequently sync with FreeBSD
       and generally follow FreeBSD in most respects.  I would be quite
       surprised if any of them made major decisions regarding manual
       pages and/or typesetting software.

That's it, basically.  Yes, there may be a few obscure one-man
thingies, but calling any of those a "BSD project" would be a bit
of a stretch.  For example, http://www.mirbsd.org/ starts with:
 "MirBSD is mirabilos' Open Source playground."
So it doesn't even claim to be an operating system project, and
i'm not going to try and do research on projects of that size.

So i'm convinced *BSD is firmly in the "groff and mandoc" camp as
far as "preferred" goes - of course you can use Heirloom on any *BSD,
and there may be good reasons to, occasionally - but certainly not
for manual pages, that would be a really dumb idea.

The license question isn't really relevant here, not even in OpenBSD.
Even OpenBSD kept GPLv2 groff in base as long as it was needed there -
even though OpenBSD is probably the most strict among the BSD projects
when it comes to rejecting non-free licenses like GPL and CDDL.
Hell, OpenBSD even includes CLANG even though it is non-free software
(under Apache 2 license - not quite as horrific as any version of the
GPL or CDDL, but still bad enough).  When there is no choice,
compromises need to be made, unfortunately.

> I am sad that currently Pod::Man is one of the impediments to good
> rendering of manual pages in other formats, since I make use of more of
> the *roff language (mostly to work around bugs) than those tools often
> understand.

Actually, pod2man(1) is by far the best man(7) code generator i have
seen so far.  Getting it supported was among the first things i did
in mandoc, about ten years ago, and it was not terribly difficult.

> So I have an incentive to want to simplify the output as much
> as I can, consistent with remaining portable.

Yes, if you can make it even cleaner, that will certainly be worthwhile
and welcome, but don't fall prey to the misconception that pod2man(1)
were somehow bad and/or a significant origin of trouble.

About 90% of trouble 

[bug #63076] Adding Russian language to groff

2022-09-17 Thread Ingo Schwarze
Update of bug #63076 (project groff):

Severity:  3 - Normal => 1 - Wish   
  Item Group:None => Feature change 

___

Follow-up Comment #2:

Someone has to do the work, preferably someone who has a basic understanding
of three subjects: groff, the Russian language, and contributing to Free
Software projects.  Professional skills are likely not required in any of
these three areas, but some time and diligence are.

I doubt that any of the groff developers would be opposed to improving support
for the Russian language (or any other language, no matter how many people
speak it), if somebody interested takes the lead in developing the needed
patches and helps with integration and ongoing maintenance.






Re: Thoughts on tbl(1)

2022-09-06 Thread Ingo Schwarze
Hi Alejandro & Branden,

G. Branden Robinson wrote on Tue, Sep 06, 2022 at 07:49:58AM -0500:
> At 2022-09-06T13:37:39+0200, Alejandro Colomar wrote:

>> I was wondering if tbl(1) wouldn't be better split into tbl(1) and
>> groff_tbl(7)
> [...]
>> I'd like to be able to refer to tbl(7) as a language when talking
>> about it as a language.

> That's a reasonable request.  I think Ingo already does this in mandoc.

Not quite.

The mandoc package does provide a tbl(7) language manual, but it does
not provide a tbl(1) command line command, and consequently no tbl(1)
manual page either.

[...]
>> And, I think it also makes sense to separate documentation about the
>> command and its options from documentation about the language.

The GNU tbl(1) command has no options, which is actually laudable.
Well, there is the rather obscure -C option.  I say obscure because
even formatting the oldest manual pages that ever used tbl(7),
those in PWB and v7, does not need -C.

I think if a command has no options except a compatibility option
that is almost never needed even in extreme cases where one might
expect that they might possibly require compatibility, it is fair
to say that for practical purposes, it has no option.

Doug is right that a manual for only the command line command
would be almost empty.

Yours,
  Ingo



[bug #49390] last line of boxed tables overprinted on nroff devices

2022-08-28 Thread Ingo Schwarze
Follow-up Comment #7, bug #49390 (project groff):

Fix merged to mandoc in
https://cvsweb.bsd.lv/mandoc/tbl_term.c#rev1.79
https://marc.info/?l=mandoc-source&m=166168405927772






[bug #62841] [man] stop forcing vertical space before tbl(1) tables

2022-08-28 Thread Ingo Schwarze
Follow-up Comment #3, bug #62841 (project groff):

Merged to mandoc in:
https://cvsweb.bsd.lv/mandoc/man_term.c#rev1.241
https://marc.info/?l=mandoc-source&m=166168038326866






[bug #62841] [man] stop forcing vertical space before tbl(1) tables

2022-08-28 Thread Ingo Schwarze
Update of bug #62841 (project groff):

Severity:  3 - Normal => 4 - Important  
  Item Group: Rendering/Cosmetics => Feature change 

___

Follow-up Comment #2:

This is an incompatible change, changing the output for large numbers of
existing manual pages.  So in addition to asking "Does this change make
writing new pages simpler and more flexible and more consistent?", which
gbranden@ answers affirmatively below (and i agree with him so far), another
reasonable question to ask is "Does it typically improve or degrade the
rendering of existing real-world manual pages?"

As a random sample, i took the manual pages in the OpenBSD base system and
individually looked at each one in turn.  Here are the results (i stopped
after about one third of X11 because a clear pattern is emerging; all non-X11
pages were inspected though):

now looks better because it expects tbl to directly follow preceding text:
  XAllocSizeHints(3) XAllocStandardColormap(3) XChangeKeyboardControl(3)
  XConfigureWindow(3) XCreateWindow(3) XGetVisualInfo(3)

now looks better because of existing .sp before .TS:
  curs_getch(3)
  bitmap(1) editres(1)
  GLwDrawingArea(3) GLwMDrawingArea(3)
  XkbActionCtrls(3) XkbAllocCompatMap(3) XkbBell(3) XkbAllocControls(3)

now looks worse because of missing .PP:
  captoinfo(1) infocmp(1)
  curs_inch(3)
  xterm(1) Yserver(1)
  XkbAllocClientMap(3)

inconsistent use within the same page, parts better, parts worse:
  xedit(1)
  XAllocWMHints(3) XCreateGC(3)

now looks worse because of explicit trailing .sp 2v:
  tbl(7)

output unchanged:
  llvm-objcopy(1)
  curs_addch(3) curs_attr(3) curs_mouse(3) curses(3) form(3) menu(3)
  phantasia(6)
  mkhybrid(8)
  xcalc(1)
  XDrawArc(3) XGetWindowAttributes(3) XQueryColor(3)

irrelevant, uses very bad formatting in the first place:
  terminfo(5)

To summarize, existing usage is *wildly* inconsistent among real-world pages. 
There is a significant minority of pages that look worse after this change,
but that cannot really be called a regression because there are similar, and
possibly larger, numbers that look better.

To express the same conclusion in a different way, it appears very few manual
page authors understood how spacing before .TS was supposed to work, which
implies that changing the rules mid-game and simplifying them makes sense. 
The benefit of simplicity and consistency in the rules clearly outweighs the
occasional formatting degradation, which moreover is easy to fix by adding .PP
before .TS in the pages in question.
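
Finding the pages in question lends itself to a rough mechanical check.
The following shell sketch, which assumes nothing beyond POSIX awk, lists
each .TS whose immediately preceding input line is not .PP, .LP, .P, or
.sp.  The function name ts_without_break and the choice of macros are
assumptions made for illustration; the heuristic knows nothing about blank
lines, comments, or macros defined in the page itself, so its output is a
list of candidates to look at, not a verdict.

```shell
# ts_without_break FILE...: print "file:line" for every .TS line whose
# directly preceding input line does not start with .PP, .LP, .P, or .sp.
# Rough heuristic for illustration only; the name is hypothetical.
ts_without_break() {
    awk '
        FNR == 1 { prev = "" }                      # reset at each new file
        $1 == ".TS" && prev !~ /^\.(PP|LP|P|sp)$/ {
            print FILENAME ":" FNR
        }
        { prev = $1 }                               # remember first word
    ' "$@"
}

# Made-up sample: the first table follows .PP, the second follows text.
demo=$(mktemp)
printf '%s\n' '.PP' '.TS' 'a' '.TE' 'Some text.' '.TS' 'b' '.TE' > "$demo"
ts_without_break "$demo"    # reports only line 6 of the sample file
```

Whether a reported table actually needs a .PP still requires looking at
the rendered page, as was done for the sample above.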

I'll follow with mandoc(1).

I did change the "Item Group" from "Cosmetics" to "Feature Change" though
because it is a man(7) API change.  It changes the semantics of the .TS
macro, it changes how authors have to use .TS in manual pages, and it needs to
be announced as a feature change that is imposed *without providing backward
compatibility*.  (And i don't think backward compatibility would be
feasible.)

I also changed the "Severity" from "Normal" to "Important": if we have to tell
users that they need to change their finger memory, that is important.






Re: groff maintainership, release, and blockers

2022-08-27 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Sat, Aug 27, 2022 at 06:55:13AM -0500:
> At 2022-08-27T12:49:05+0200, Ingo Schwarze wrote:

>> There remains a regression in man(7) .EX/.EE which i will report ASAP.

> If it's this one, I already pushed a fix days ago.
> 
> commit f287bb7243a7d77e8b4f3a432d00c6f0681b687d
> Author: G. Branden Robinson 
> Date:   Tue Aug 23 12:56:02 2022 -0500
> 
> [man]: Restore robustness to `EE` misuse.
[...]
> This regressed post-1.22.4.  Thanks to Ingo Schwarze for the report and
> a proposed patch.

The remaining problem is that you only committed half of the fix.

The .nf or .fi, respectively, needs to be done unconditionally, like
it was in my original minimal patch, even if an*is-in-example is in the
wrong state, such that even a stray .EX or .EE at least produces a break.

In git master,

  inside
  .EE
  outside

with no .EX now renders as

  inside outside

whereas traditionally, it rendered as

  inside
  outside

The traditional behaviour was better because if the author mistakenly
thinks that an .EX display is open and closes it, that's an unambiguous
signal that they do *not* want to continue output on the same output line.

Yours,
  Ingo



[bug #62926] [mdoc] align styling of titles and man page cross references with man(7)

2022-08-27 Thread Ingo Schwarze
Follow-up Comment #6, bug #62926 (project groff):

It seems CS, CT, D, HY, LL, LT, and S are GNU only, but cR already appeared in
4.4BSD.
So indeed, the horse is barely in sight of the barn any longer, let alone
inside of it.  No, i wasn't aware of that.

In this situation, while MF does add even more noise to the manual page (noise
which authors don't need and users will not usually even look at), it does not
seem to make the situation that much worse than it already is, it merely
continues one more step down the declining slope, as long as you don't break
.Nm.

P.S.
My concept of API is quite simple: it's the syntax and semantics described in
the manual page - because that's what users of the language need to review.






Re: Warn about long lines

2022-08-27 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Fri, Aug 26, 2022 at 08:51:15PM +0200:
> On 8/26/22 08:18, Ralph Corderoy wrote:

>> - If it's groff, then use ‘-rLL=80n’; see groff_man(7).

> Ahh, this is what I needed.  I sometimes struggle to understand how 
> groff divides the implementation.  Why is LL only documented in 
> (implemented for?) groff_man(7)?

Because it is a groff_man(7) feature only -
that is, groff(1) only *and* man(7) only.

> I wouldn't have expected it there.

Your expectation is indeed reasonable.

As an aside, mandoc(1) syntax was designed to satisfy exactly that
expectation, see the mandoc -O width= argument here:
https://man.openbsd.org/mandoc.1#ASCII_Output

The reason i took the liberty to use the mandoc-specific -O option
for this purpose is that i failed to find a command line option in
groff doing what a reasonable user would expect and might indeed need.

> It doesn't seem like a man(7)-specific thing (it may be implemented for 
> -man only, but that seems non-obvious to me).  I mean, when searching 
> for an option that controls the line length, I expect it to be a generic 
> option that will be applicable to groff as a whole, and not to a 
> specific macro set.  I fail to find documentation about these things for 
> that reason.

The generic feature to control the line length is the roff(7) .ll request.

   $ cat tmp.roff 
  some words on a single input line
   $ (echo .ll 12; cat tmp.roff) | groff -T ascii | head -n 3
  some   words
  on a  single
  input line
   $ echo .ll 12 | groff -T ascii - tmp.roff | head -n 3
  some   words
  on a  single
  input line

I'm not aware of a groff option to merge parts of the command line 
into the groff input stream - well, -r is a bit like that, but only
for registers, not for requests.
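
For repeated use, the first idiom above can be wrapped in a small shell
function.  This is only a convenience sketch: the name with_request is
made up, and the function does nothing but emit the given request line
ahead of the named files, exactly like the (echo ...; cat ...) group
shown above.  Note that a macro package such as -man may set the line
length itself, in which case a request injected this way may not have
the intended effect.

```shell
# with_request REQUEST [FILE...]: write REQUEST as the first line, then the
# contents of the files (or of standard input if no file is given).
# Hypothetical helper for illustration; it does not call any formatter.
with_request() {
    request=$1
    shift
    printf '%s\n' "$request"
    cat -- "$@"
}

demo=$(mktemp)
printf '%s\n' 'some words on a single input line' > "$demo"
with_request '.ll 12' "$demo"

# Typical use, if groff is installed:
#   with_request '.ll 12' tmp.roff | groff -T ascii | head -n 3
```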

Yours,
  Ingo



Re: groff maintainership, release, and blockers

2022-08-27 Thread Ingo Schwarze
Hi Bertrand,

Bertrand Garrigues via wrote on Fri, Aug 26, 2022 at 11:53:07PM +0200:
> On Fri, Aug 26 2022 at 02:04:57 PM, G. Branden Robinson wrote:
>> At 2022-08-26T13:51:25+0200, Ingo Schwarze wrote:

>>> In particular, i'm firmly convinced that issuing an RC while even one
>>> single blocker issue is unresolved is a blatant contradiction.  Before
>>> an RC, all blockers must either be resolved or explicitly
>>> re-classified as "not release critical" and re-scheduled for the
>>> subsequent release.

> I understand your point Ingo, however the rc1 tag is almost 2 years old,

Yes, no argument about the RC1 tag going wrong and being useless now.

> so I feel we need to make a new tag now, and from this tag decide
> which bugs must absolutely be fixed.

I'm not convinced deciding which issues shall be fixed and which postponed
needs a tag, nor that calling such a tag an "RC" is easy to understand,
but i think Branden is right that's only a terminological argument
after all.

> I won't release any official 1.23.0 if you consider there is a blocker
> or that the mandoc is not in a good shape.

Actually, i deem the state of mandoc irrelevant in this context.  It is
perfectly fine to release groff while mandoc is in an unstable state.
(As a matter of fact, mandoc is not close to release-ready right now:
it has various open bugs and some unfinished work, but that is off-topic
here.)

> For sure there will be some bug fixes after rc2 and we'll have an rc3,
> so it's perhaps not exactly a "Release Candidate", it's a kind of
> "intermediate tag" or an "alpha release", but I'll still name it rc2,
> for the sake of simplicity.

Fair enough, file name consistency is a plus.
It is desirable for the announcement to be clear about the intended
purpose of RC2, though.

> Could you please detail here what is your list of blockers that you
> think must be absolutely fixed before the official 1.23.0 release?

I cannot give a final list off the top of my head because i am aware
of several candidate regressions that require analysis.

Are you planning to move towards an RC during the next two to three
weeks?  In that case, i should probably prioritize triage over fixing,
which i usually avoid (if possible) because it increases the total
working time consumed.  But it would speed up drafting a list of
potential issues that might be worth fixing before release.


I recommend investigating this one before RC2, if feasible:

#62918 Wrong GhostScript version reported during build

There remains a regression in man(7) .EX/.EE which i will report ASAP.

Even though this is documentation only, i recommend resolving it before
RC3 because switching back and forth in terminology in three consecutive
releases could be regarded as awkward:

#62816 rename the \& escape...again

I recommend postponing these until after release:

#62774 [mdoc] warn if any of `Dd`, `Dt`, `Os` not called
#62926 [mdoc] align styling of titles and man page cross references with man(7)
#62933 [man] produce hyperlinks in PDF output

Yours,
  Ingo



Re: groff maintainership, release, and blockers (was: groff 1.23.0.rc2 readiness)

2022-08-27 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Fri, Aug 26, 2022 at 02:04:57PM -0500:

[...]
> The FSF provides useful infrastructure.

Fair point.  You are right that the effort required to run servers for
a VCS, ticket handling, web, and mail is not negligible.  So if the
people actually doing the administrative work think the FSF services
are worth spending a few months on processes now and then, so be it.
Of course i do *not* advocate selling out to some commercial hosting
service (like github), or to some third-party service with a totally
insane, unusable API (like github).

[...]
> The FSF has its problems but selling out to a copyright rentier firm
> seems like a pretty low risk.

Even i consider that particular risk as so low that it doesn't matter.
And even if it happened, the FSF could still be abandoned at *that*
point.

[...]
> The grass isn't so green.  In my experience nearly everything in
> software management that looks "agile" and low-friction is that way
> because there is some serious infrastructure beneath that people have
> worked hard to make unobtrusive.

That rings very true to me.


> At 2022-08-26T13:51:25+0200, Ingo Schwarze wrote:
>> Branden wrote:

[...]
>>> But for the sake of transparency, in the meantime, he asked if the
>>> current HEAD was good enough to tag as "rc2" and I said "yes".

>> Sorry, i fail to understand that.  The acronym "RC" stands for "release
>> candidate".  I would define a "release candidate" as "a version that
>> is believed to be ready for release".

> Apparently we have a terminological and/or philosophical disagreement.

The conversation below reveals that indeed the majority of the points
i raised were caused by me misunderstanding what you meant when
saying "RC".

> My objective since Bertrand added the first automatic tests before the
> groff 1.22.4 release has been to _never_ have Git HEAD in a state where
> _any_ tests fail,

That's a worthy goal - incidentally, i do the same for mandoc, and
i think that is *very* common practice in most free or commercial
software contexts.

It has nothing to do with whether or not a tree is in a state that
is close to mature enough for a release in the near future.  Even
if a project has an above-average regression suite, the tree can easily
be in a state that is highly unstable and likely contains many new
regressions even when the test suite succeeds.

Slowly adding tests to groff is not a bad idea, but right now, the
groff test suite still has close to zero coverage, so it is almost
meaningless in this respect.

> Therefore, by that standard, any commit not marked "Test fails at this
> commit."..._is a release candidate_.

I consider that a ridiculous standard.  What's the point of
having a term for something if the defining property is trivial?
Or expressing the same question differently, why have a term that
means nothing?  In particular if the term is commonly used in a
completely different sense in the context.

But i do admit that disagreement is purely terminological, so we won't
die from not resolving it.

> On the other hand, that statement is unrealistic.  We don't have a
> regression test for every defect in groff because, like any
> non-formally-verified codebase of non-trivial complexity, groff has bugs
> we don't know about and therefore cannot test for.  It can also have
> bugs that we know about but don't understand well enough to a write a
> test for, and bugs that manifest only on platforms or configurations
> that its regular developers don't test.

Indeed.  It is by definition impossible to measure and/or prove
the density of such bugs.  But sometimes, one does have reasonable
grounds for an informal, non-quantitative estimate of the density,
and when well-informed people feel it is lower than usual in their
project, they usually say "now would be a good time for release".

> None of these are novel observations; it's why people have "continuous
> integration" infrastructures.

What i'm saying is that even though *functionally*, groff-current
is decisively better than groff-1.22.4 in large numbers of respects,
my impression is the regression density in groff has, during the last
two years, never been as high as it is right now.  I feel less sure
about the years before, but if i remember correctly, that statement
not only applies to the last two years, but to the last decade.

I would call such a state "aggressive unstable development", i.e. the
direct opposite of "beta", let alone "RC".  My question is: how do we
get from an unstable development state to an RC-ready state?

Then again, you seem to disagree that the current development state
is unstable, in which case maybe mopping up the remaining known
puddles and then releasing is not unreas

Re: groff 1.23.0.rc2 readiness

2022-08-26 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Wed, Aug 24, 2022 at 07:11:10AM -0500:
> At 2022-08-24T21:18:02+1000, John Gardner wrote:
>> Bertrand wrote:

>>> As you are the most active developer, would you consider taking over
>>> the maintainership of groff?

>> Please, please, *please* let that be a "yes"…

> Bertrand and I are in private correspondence about this.
> One or both of us will follow up to the list as things progress.

Very good news, this is nice.
Also, my best wishes for Bertrand's further and speedy recovery!

> There's a lot of FSF procedure to deal with.

Wouldn't it be better to simply abandon the GNU roff project
(i.e. leaving the FSF with no developer whatsoever), fork groff under
a new name (say, "GPL roff"), and continue that new project outside
the FSF?  This is not the first time FSF red tape has proven onerous
to the point of creating major distractions from and significant
obstacles to real development work.

Of course, the FSF would retain copyright of the existing GPLv3+ code.
But that's not a problem because a free software license, once granted,
cannot be retracted.  Future contributors could simply contribute in
their own name, without transferring copyright, under GPLv3+.

Of course, if most developers love the FSF, i'm not pressing for that
change as i'm only mildly affected by the red tape.  I believe
some others suffer more from it than i do (even though it did
cause me some waste of time on a few occasions).

> But for the sake of transparency, in the meantime, he asked if the
> current HEAD was good enough to tag as "rc2" and I said "yes".

Sorry, i fail to understand that.  The acronym "RC" stands for "release
candidate".  I would define a "release candidate" as "a version that
is believed to be ready for release".  The purpose of an RC is to have
it tested on as many platforms and for as many different purposes as
possible, to confirm that indeed no undiscovered regressions exist
and that in particular the last few commits made before the RC did not
cause regressions.  These tests cause non-trivial work for significant
numbers of people, most of whom are *not* groff developers, so an RC
should only be made when the software is really believed to be ready -
both out of respect for testers' time and because releasing multiple
RCs will wear out testers and increase the likelihood of serious
bugs slipping into the release: some testers will not have the time
to test over and over again, so the more RCs you ship, the less test
coverage you get.

In particular, i'm firmly convinced that issuing an RC while even one
single blocker issue is unresolved is a blatant contradiction.  Before an
RC, all blockers must either be resolved or explicitly re-classified as
"not release critical" and re-scheduled for the subsequent release.
After the RC, it is critical to not commit anything except fixes
for critical regressions that people reported from RC testing.
In particular, after an RC, no bugs must be fixed that were already
known before the RC was sent out.

Not only do we have a significant number of open blockers, but i
also reported that the mandoc test suite found thirty-seven changes
of behaviour between the last groff release and groff-current that
i did not find the time to analyze just yet (in addition to changes
that i already investigated and that turned out to be in part groff
regressions, in part mandoc bugs, and in part intentional and useful
changes in groff behaviour).

So it is totally obvious to me that the code base is *not* in a good
shape and quite far from being ready for an RC.

Of course we shouldn't defer the release until after Judgement Day,
and it does help to have a rough idea about when we wish to get ready
for an RC.  I think if we decide now "no more commits except those
considered release-critical" then we will need at least two or three
weeks to get to a state that might support an RC, with some luck.
If things go not so well, we might need the month of October for
sorting out the remaining issues, too, but aiming for an RC between
late September and late October would seem reasonable to me.  In that
case, even if the RC reveals a so far undiscovered, major regression
(or a small number of such), time should be sufficient to fix those
and release before the beginning of December.  This is not intended
as a rigid plan, but as a rough idea of what we are aiming for.

> I'd like to encourage people with an interest in groff to step up and
> become contributors if they can.  I think having the "lead developer"
> (or "most active" one, at any rate) and "release manager" be separate
> people can be a good separation of concerns.
> 
> In a peer relationship where there's mutual respect and comity, having
> one person focussed on "this is good, let's make this a release,
> announce the good work that's been done, and get it in people's hands"
> and other on "we can make it even better: let's fix this bug and improve
> that feature" can be a healthy 

[bug #62926] [mdoc] align styling of titles and man page cross references with man(7)

2022-08-25 Thread Ingo Schwarze
Follow-up Comment #3, bug #62926 (project groff):

[comment #2:]
 
> It should surprise no one that my idea is to support the `MF` string in
> _mdoc_(7) just as is already done in _man_(7).

Yikes.  Are you aware that the mdoc(7) language API does not contain a single
user-visible register yet?
Starting to go down that rabbit hole seems like a very bad idea to me.

> I reckon this would be used to style `Xr`'s first argument,
> and the page header.

Very grudgingly, yes, if you must pollute the API with \*(MF.

> `Nm`'s first (or implied) argument,

No, absolutely not.  This is getting worse and worse!

.Nm has always been formatted as \fB and that is very important because it is
a fixed string that the user has to type verbatim.
It would be extremely confusing to use anything other than \fB for .Nm,
especially in the SYNOPSIS, but also in other places.

> While I'm in the neighborhood, page footers are _already_ out of sync
> between man and mdoc

True, and synching that might make sense.  Either way, the footer line is less
important than the header line, so i don't feel strongly about it.

mdoc: OS - date - OS
man:  OS - date - title

Arguably, the man(7) way is more useful and the mdoc(7) way somewhat redundant
in this respect.


___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




Re: Warn about long lines

2022-08-24 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Mon, Aug 22, 2022 at 12:31:15AM +0200:

> Would you mind adding a warning about this?
> 
> I'm currently doing a global fix in the Linux man-pages turning kernel 
> types like __u64 into the standard uint64_t that user-space programmers 
> expect.  Most of these are used within structure definitions, and so 
> they are within .EX/.EE (non-filled).  I fear that I might be making one 
> of those structure definitions go past the right margin, and there are 
> so many, that it's not funny rendering all of them to check; not even 
> only those that I suspect that might; especially, since some may be 
> deeply indented in .RS/.RE blocks that I may not notice (and that 
> happened at least once --in one that I checked, luckily--).

That is an interesting idea.  I certainly see how it could be useful
for users, without having to run an additional tool or command.

It may not be easy to implement in a useful way, though.
The obvious first idea is to issue the warning in the terminal
formatter, which is part of the program that already does output
column counting.  That, however, would be confusing because "mandoc
-Tlint" never invokes any formatter, so the warning would *not* appear
in its output, nor would it affect the exit code.  If implemented in
the formatter, getting the warning would require a command similar to

  mandoc -T ascii -W all > /dev/null

which i don't really want to recommend.

For now, i am adding this entry to the mandoc TODO file:

 - warn about output lines exceeding 80 characters
   Alejandro Colomar Aug 22, 2022
   not trivial because -T lint does not call any formatter
   loc ***  exist *  algo **  size **  imp **

No guarantee whether or when it can be done.

For the time being, i suggest this workaround:

   $ cd /co/linux-man-pages/
   $ for f in man?/*; do mandoc "$f" | col -b | grep -E '.{80}' && echo "$f"; done
  ldap://host.com:/o=University%20of%20Michigan,c=US??sub?(cn=Babs%20Jensen)
  man7/uri.7

  ldap://ldap.itd.umich.edu/o=University%20of%20Michigan,c=US?postalAddress
  ldap://host.com:/o=University%20of%20Michigan,c=US??sub?(cn=Babs%20Jensen)
  man7/url.7

  ldap://ldap.itd.umich.edu/o=University%20of%20Michigan,c=US?postalAddress
  ldap://host.com:/o=University%20of%20Michigan,c=US??sub?(cn=Babs%20Jensen)
  man7/urn.7

  RuleUS  19671973-   AprlastSun2:00w1:00dD
  man8/zic.8
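
The same check can be sketched in Python for cases where per-line numbers
and widths are wanted.  This is only an illustration and not part of
mandoc: the names strip_overstrike and long_lines are hypothetical, the
overstrike removal merely approximates what col -b does for bold and
underlined text, and widths are counted in characters rather than in
display columns.

```python
import re

# A character followed by a backspace, as used for bold and underline
# in terminal output.  Dropping each such pair keeps the last character
# written to a column, which approximates "col -b" for this case.
OVERSTRIKE = re.compile(r".\x08")


def strip_overstrike(line):
    """Return the line without character-plus-backspace pairs."""
    return OVERSTRIKE.sub("", line)


def long_lines(text, limit=80):
    """Yield (line_number, width) for lines of at least `limit` characters.

    The threshold matches grep -E '.{80}', which selects lines
    containing 80 or more characters.
    """
    for number, line in enumerate(text.splitlines(), start=1):
        width = len(strip_overstrike(line))
        if width >= limit:
            yield number, width
```

Passing the formatted text of one page to long_lines() yields one
(line number, width) pair per offending output line.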

> $ cat longline.man
> .TH a b c d
> .SH foo
> .nf
> this is a very long line that will go past the 80-col right margin, and 
> I want to be warned about it.

Don't worry, Thunderbird hates everyone, not you specifically.
I do wonder why so many use it anyway, though.

Yours,
  Ingo



[bug #62926] [mdoc] align styling of headers and man page cross references with man(7)

2022-08-20 Thread Ingo Schwarze
Follow-up Comment #1, bug #62926 (project groff):

No comment on the first sentence because it is not very specific.

I do not object to the rest of the original submission.  But i strongly
disagree with the summary.  mdoc(7) has always consistently styled .Xr as \fR
for terminal output whereas man(7) has been inconsistent over time.  In v7, it
even styled manual page cross references differently inside and outside of SEE
ALSO.

So if anything, man(7) ought to adopt the established convention of mdoc(7)
rather than infect mdoc(7) with its own instability.






[bug #62918] Wrong GhostScript version reported during build

2022-08-19 Thread Ingo Schwarze
Follow-up Comment #2, bug #62918 (project groff):

[comment #1:]
> I will be adding a new program, pdfmake,

Please don't.

Adding yet another program for each afterthought forgotten in the original UI
design ruins an API by adding more and more complexity and accumulating more
and more technical debt.

There is only one job: generating a PDF file from roff(7) input, so there
should only be one user-visible program doing that.  But we already have
three: groff -T pdf (which itself is a wrapper around troff(1) and
gropdf(1)!), pdfmom(1), and pdfroff(1), which really is a shame.

Considering that groff(1) already is a wrapper, the best design would clearly
be to make groff -T pdf do the right thing and get rid of the wrappers
wrapping wrappers.

But even if such a cleanup of the design and implementation would cause too
much work right now, adding *yet another* wrapper around wrappers around
wrappers sounds like an absolutely terrible idea to me.  The minimum amount of
design work to be done when adding a new feature is to make sure that it does
not ruin the quality of the already dubiously designed user interface even
further, IMHO.







man(7) .EE vertical spacing regression

2022-08-19 Thread Ingo Schwarze
Hi Branden,

the following commit caused a regression:

  commit 15f8188656ef0ebed797eb5981b012b590fc77ad
  Author: G. Branden Robinson 
  Date:   Wed Feb 16 19:49:58 2022 +1100

[man]: Refactor `EX` and `EE` macros.

Consider the following test file:

  .TH TEST 1
  .SH NAME
  test \- test
  .SH DESCRIPTION
  initial text
  .EX
  .EE
  .EE
  .PP
  final text

The problem is that starting with this commit, .EE does

  .rr an*saved-paragraph-distance

such that the second .EE now does

  .nr PD 0

whereas before the commit, it would have reused the value saved
by the previous .EX macro.  So before your commit, there was a blank
output line before the "final text" (as expected due to .PP) whereas
after your commit, that blank line is gone.

Admittedly, behaviour wasn't quite correct even before your commit.
For example,

  .SH DESCRIPTION
  initial text
  .EE
  .PP
  final text

never printed the blank line because without any prior .EX, .EE
resulted in .PD 0 anyway.  Also,

  .SH DESCRIPTION
  initial text
  .EX
  .EE
  .PD 2
  .EE
  .PP
  final text

was already mishandled before your commit because the second .EE
neutered the effect of the explicit .PD 2.

I think the best way to fix all this is to make sure that .EE
only manipulates settings when an .EX block is actually open,
see the patch below.
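
The intended behaviour can be modelled in a few lines of Python.  This is
only a sketch under simplifying assumptions: the class ManState and the
helper final_pd are hypothetical names, the model tracks only the fill
mode and the PD register (in units of v, as on terminal output), and it
ignores the font and family handling that the real macros also do.

```python
class ManState:
    """Toy model of the man(7) state touched by .EX and .EE (hypothetical)."""

    def __init__(self, pd=1):
        self.pd = pd          # paragraph distance, in units of v
        self.fill = True      # fill mode, as toggled by .fi and .nf
        self.saved_pd = None  # None means that no .EX block is open

    def EX(self):
        self.fill = False     # .nf is done unconditionally
        if self.saved_pd is not None:
            return            # already inside .EX: keep the saved value
        self.saved_pd = self.pd
        self.pd = 1

    def EE(self):
        self.fill = True      # .fi is done unconditionally
        if self.saved_pd is None:
            return            # no open .EX block: leave PD alone
        self.pd = self.saved_pd
        self.saved_pd = None

    def PD(self, distance):
        self.pd = distance


def final_pd(macros, pd=1):
    """Run a sequence like ["EX", "EE", ("PD", 2)] and return the final PD."""
    state = ManState(pd)
    for macro in macros:
        if isinstance(macro, tuple):
            name, arg = macro
            getattr(state, name)(arg)
        else:
            getattr(state, macro)()
    return state.pd
```

In this model, none of the three inputs discussed above loses the
paragraph distance: final_pd(["EX", "EE", "EE"]) and final_pd(["EE"])
both stay at 1, and final_pd(["EX", "EE", ("PD", 2), "EE"]) keeps the
explicit 2.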

As a regression fix patch, i intentionally kept the patch minimal
for review.  If you want, feel free to emit diagnostics in these
cases, which is likely easy.

i don't think it would make sense to support nesting of .EX blocks,
both because use cases for .EX do not require nesting and because
man(7) supports nesting for very few block macros anyway (.RS being
the notable exception).

What do you think?
Do you want to polish and commit it, or should i push it?

Yours,
  Ingo

P.S.:
After doing another "git pull", the number of new regression
failures in the mandoc test suite just went up from about half a
dozen to thirty-seven.  Most of those are probably due to intentional
changes in groff of vertical spacing around tbl(1) blocks.  Still,
i'll have to check whether these changes indeed *all* make sense.
So we are not exactly getting closer to a stable state that might
be good enough for release.  Then again, if we have given up hope of
releasing any time soon, that might not be a problem...


diff --git a/tmac/an.tmac b/tmac/an.tmac
index cf28fc3ce..fbc8cf845 100644
--- a/tmac/an.tmac
+++ b/tmac/an.tmac
@@ -986,11 +986,12 @@ contains unsupported escape sequence
 .
 .\" Begin an example (typically of source code or shell input).
 .de1 EX
+.  nf
+.  if ran*saved-font .return
 .  ds an*saved-family \\n[.fam]
 .  nr an*saved-font \\n[.f]
 .  nr an*saved-paragraph-distance \\n[PD]
 .  nr PD 1v
-.  nf
 .  \" If using the DVI output device, we have no constant-width fonts of
 .  \" bold weight and, relatedly, no constant-width family (because that
 .  \" requires all four styles).  Remap the bold styles to normal ones.
@@ -1006,6 +1007,8 @@ contains unsupported escape sequence
 .
 .\" End example.
 .de EE
+.  fi
+.  if !ran*saved-font .return
 .  \" Undo the remappings from `EX`.
 .  ie '\*[.T]'dvi' \{\
 .ftr R
@@ -1016,7 +1019,6 @@ contains unsupported escape sequence
 .  fam \\*[an*saved-family]
 .  ft \\n[an*saved-font]
 .  nr PD \\n[an*saved-paragraph-distance]
-.  fi
 .  rr an*saved-paragraph-distance
 .  rr an*saved-font
 .  rm an*saved-family



Re: Converting between macro sets?

2022-08-18 Thread Ingo Schwarze
Hi Robert,

Robert Goulding wrote on Thu, Aug 18, 2022 at 02:08:23PM -0400:

> Just out of curiosity, has anyone written a script to, say, convert text
> written with -ms to -me?

Not that i know of.  However, i do maintain programs to convert

 * from -mdoc      to -man:  https://man.openbsd.org/mandoc.1#Man_Output
 * from perlpod(1) to -mdoc: https://mandoc.bsd.lv/pod2mdoc/
 * from texinfo(5) to -mdoc: https://mandoc.bsd.lv/texi2mdoc/
 * from DocBook    to -mdoc: https://mandoc.bsd.lv/docbook2mdoc/

Yours,
  Ingo



Re: Standardize roff

2022-08-16 Thread Ingo Schwarze
Hi Sam,

Sam Varshavchik wrote on Sun, Aug 14, 2022 at 08:20:34PM -0400:
> Ingo Schwarze writes:
>> DJ Chase wrote on Sat, Aug 13, 2022 at 05:27:34PM +:

>>> Have we ever considered a de jure *roff standard?

>> No, i think that would be pure madness given the amount of working
>> time available in any of the roff projects.

> I tinkered with something like this some years ago, but I took a slightly  
> different approach.
> 
> I converted man pages

What kind of manual pages?

> from 'roff source to Docbook XML using a … pretty large Perl script.

That sounds very foolish on several levels.

First, and most obviously, you seem to be duplicating esr@'s work
on doclifter:

  http://www.catb.org/~esr/doclifter/
  https://gitlab.com/esr/doclifter/-/blob/master/doclifter

Second, quick and dirty Perl-style parsing is usually not good
enough to parse roff code, and a huge script is not particularly
good for readability and maintainability.

Yes, i know the same reservations would apply to esr@'s work,
which is a giant Python 3 script.  But at least there is some
evidence that his work was able to find significant numbers of
real issues in real manual pages.

> Once a year, or so, when I have nothing better to do I pull the current
> man  page tarball and reconvert it. I usually need to tinker the Perl
> script, here and there, each time.
> 
> The Docbook folks provide a stylesheet that converts Docbook XML
> back to 'roff.

Yikes.  That thing is by far the worst man(7) code generator existing
on this planet.  If at all possible, you should avoid that toolchain
like the plague.

It is so bad that for years, bogus reports caused by that totally
broken toolchain have caused the majority of invalid mandoc bug
reports.

> The end result you get is standardized 'roff, whatever that means.

Absolutely not.  The result is utter crap.  It is rarely even
syntactically valid, let alone reasonable style.

> But, yes, the effort required to clean up and standardize the formatting
> of man pages would be mammoth. There's more inconsistency across the
> various man pages, from various sources, than consistency.

That isn't completely untrue, but all the same, mandoc copes well
enough with more than 95% of valid real-world manual pages, and groff
with 100%.  In a nutshell, the only stuff that breaks with groff
is manual pages that are completely invalid, usually coming from
the official DocBook XML toolchain, and in rarer cases coming from
other broken man(7) generators.

All this is barely related to the question of standardizing roff(7),
though.  Roff is much more than manual pages.

Yours,
  Ingo



Re: Standardize roff (was: *roff `\~` support)

2022-08-16 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Mon, Aug 15, 2022 at 01:59:24PM +0200:
> On 8/14/22 21:43, DJ Chase wrote:

>> Do you think that a descriptive/trailing standard could be beneficial
>> or would you still say that it could mostly hinder *roff
>> implementations?

When prepared with diligence and without falling for featurism,
it might be useful because the common subset of the major roff
implementations is large enough that it would likely be possible to
prepare portable roff documents following such a standard.

However, such a standard could likely *not* include *any* of the
best features of any of the implementations: yes, implementations
have diverged that much - not quite as bad as make(1), but still more
than many other classical Unix programs.  Consequently, only authors
with modest needs could possibly consider adhering to the standard.
To provide some striking examples, the standard could include neither
the mom(7) macro set - which is a killer feature of groff - nor the
mdoc(7) macro set - which has been an important feature of groff for
more than 30 years and of mandoc for more than 10 years.

This is all theoretical though - as i explained, the effort required
for developing such a (necessarily seriously stunted) standard is
prohibitive.

[...]
> But we can achieve something very similar by documenting the differences 
> between known roff alternatives somewhere.  And that's likely to be much 
> easier.

That's a much lower bar than a standard, but don't underestimate
the effort involved even in that.

A few very small parts of that already exist.

For example,

  https://mandoc.bsd.lv/man/man.options.1.html

documents command line options of some roff(1) and man(1)
implementations, mostly intended for people who see themselves
forced to invent a new command line option - which should of course
be avoided if at all possible because the tangle of existing options
is already terrifying.

For example,

  https://man.openbsd.org/roff.7

documents roff requests and roff escape sequences; search for
"extension" in that page.  Even though this page focusses on groff,
Heirloom, and mandoc and does not mention Plan 9, neatroff, or other
implementations, the amount of compatibility information scattered
around that page is already larger than what would seem healthy for
most user-facing documentation.  It's OK here because this page is
geared more towards developers than towards users.
Also, note that this page is already very long even though it is
extremely terse - so terse that it is insufficient for learning
how to use most of the features mentioned.

> In the Linux man-pages we document when a function is in ISO C or in 
> POSIX, but also when it's not standardized but present in other Unix 
> systems (so that it has some degree of portability), or when it is 
> Linux-only.  Maybe having something similar in groff's manual pages 
> would be effective.

Except that the bulk, and in particular the core, of groff functionality
is *not* described in manual pages in the first place.  Would you
want to litter groff.texi with compatibility information throughout?
That would likely cause a significant increase in size, almost certainly
a very significant decrease in maintainability, and possibly it might also
somewhat decrease readability.

> For example, for .MR, we were discussing that probably it would be good 
> to add a note like "(since groff 1.23.0)" and maybe it could also state 
> which other roff (or mandoc) implementations support it.

But that feels like an exception rather than the rule.  It seems
warranted for this particular case because we are introducing a
new feature without consideration for compatibility that will cause
information loss for end-users unless something unusual is done
about it.  Hopefully, we are not going to turn that vice into a habit.

The particular case of .MR is somewhat specific to manual pages, too.
If people prepare a typeset document using many advanced features with
groff or Heirloom, they are used to the fact that it won't work with
the other, nor with Plan 9.  That's not a major problem because most of
the time, the author is the only person who really needs to typeset a
document.  Nowadays, the average reader will only read the PDF version,
which is totally different from the situation with manual pages.

Yours,
  Ingo



Re: TAB character in groff output

2022-08-15 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Tue, Aug 16, 2022 at 12:01:40AM +0200:

> Ingo, is mandoc(1) planning to support .MR?

Yes, almost certainly.

I'm not enthusiastic about it, but given that groff is going ahead
with it, it is clearly better to support it than to not support it.

The most likely timing for adding support is shortly after the next
groff release.  Before the groff release, it isn't urgent at all
for obvious reasons.

Right now, i'm slowly working through inconsistencies that popped up
in the mandoc test suite after regenerating the expected output with
-current groff.  Getting that sorted out before the groff release would
be ideal because some of these issues might be regressions in groff
(like the groff_mdoc(7) prologue regressions i reported earlier).
What makes this work a bit tedious is that apparently, not all changes
that popped up are groff regressions.  For example, for the second
change i looked into, it appears behaviour is mostly consistent
between GNU, Heirloom, and Plan 9 roff and it is mandoc that is off,
so there is no need to report that here and i'm instead fixing mandoc
(it is related to literal tab characters in filled text).

Eleven new differences are left right now and i suspect these
are likely due to at least four and probably not more than eight
different changes; the exact number of issues is not clear yet.
Most are differences in vertical spacing, but in different contexts, so
there is likely more than one vertical spacing issue.  One difference
concerns paragraph breaking, one concerns horizontal spacing, and
two concern the scope of font markup.

Yours,
  Ingo



Re: *roff `\~` support (was: [PATCH 4/6] xattr.7: wfix)

2022-08-14 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Fri, Aug 12, 2022 at 11:23:11PM -0500:
> At 2022-08-12T16:30:01+0200, Ingo Schwarze wrote:

>> There are people using Plan 9 for practical work though, they have
>> even occasionally posted on the groff and mandoc lists, so that is a
>> bit more of a problem.

> plan9port's troff is no longer a problem, thanks to Dan Cross acting on
> my pull request at relativistic speed.
> https://github.com/9fans/plan9port/commit/93f814360076ccf28d33c9cb909fca7200ba4a7d

Nice.  :-)

> I also have a PR pending with Illumos.
> https://github.com/illumos/illumos-gate/pull/83

Illumos isn't doing development on GitHub.

Besides, Illumos is less of a problem because they have been using
mandoc as the default manual page formatter since July 2014.

All the same, getting \~ supported in their general-purpose
roff implementation is no doubt nice to have, too.

That reduces my concerns mostly to commercial UNIXes and potentially
to a few ad-hoc conversion tools we are not even aware of.
Consequently, the concerns aren't 100% resolved yet but getting
closer to becoming theoretical concerns.  If it's only commercial
UNIXes and unknown tools that may break, the improved typesetting
quality may be worth the risk.

Yours,
  Ingo



Standardize roff (was: *roff `\~` support)

2022-08-14 Thread Ingo Schwarze
Hi,

DJ Chase wrote on Sat, Aug 13, 2022 at 05:27:34PM +:

> Have we ever considered a de jure *roff standard?

No, i think that would be pure madness given the amount of working
time available in any of the roff projects.

I expect the amount of effort required to be significantly larger
than the amount of effort that would be required for rewriting
the entire groff documentation from scratch because:

 1. You would have to study all features of all the major roff
implementations (groff, Heirloom, neatroff, Plan 9, and possibly
some others, maybe even historical ones) and compare the features.
For every difference (i.e. typically multiple times for almost every
feature), you would have to descide which behaviour to standardize
and what to leave unspecified.

 2. Discussions of the kind mentioned in item 1 are typically
lengthy and often heated.  If you don't believe me, just buy
several pounds of popcorn and watch the Austin list, where
maintenance of the POSIX standard is being discussed.
Even discussions of the most minute details tend to be
complicated and extended.

 3. Even after deciding what you want to specify, looking at the
manuals typically provides very little help because a
standard document requires a completely different style.
User and even reference documentation is optimized for clarity,
comprehensibility, and usefulness in practice; a standard document
needs to be optimized for formal precision, whereas
comprehensibility and conciseness matter much less.

 4. Even when you have the text - almost certainly after many years
of work by many people - be prepared for huge amounts of red
tape, like dealing with elected decision-making bodies of
professional associations, for example the IEEE.  Be prepared
for having to know things like what technical societies,
technical councils, and technical committees are, and how to
deal with each of them.  You are certainly in for a lot of
committee work, and i would count you lucky if you got away
without having to deal with lawyers, paying membership fees,
buying expensive standard documents you need for your work,
and so on and so forth.  Even when you submit a technically
perfect proposal, it will typically be rejected without even
being considered until you secure the official sponsorship
of at least one of the following: the IEEE, the Open Group,
or ISO/IEC JTC 1/SC 22.  Of course, your mileage may vary
depending on what exactly you want to standardize and how,
but since roff(1) is arguably the most famous UNIX program,
i wouldn't be surprised if you were in for an unabridged
POSIX-style Odyssey.

 5. The above is not helped by standards committee work being
typically conducted in ways that are technically ridiculously
outdated, and i'm saying that as an avid user of cvs(1) who
somewhat dislikes git(1) as overengineered and very strongly
detests GitHub.  Take the Austin Group as an example.  Most of
its work is changing the content of technical documents,
but the group *never* uses diff(1), never uses patch(1), and
never makes diffs available even after they have been approved.
They are very firmly stuck in the 1980s regarding the technologies
they are using and missed even most of the 1990s innovations.
They do have some kind of version control system internally, but
no web interface to that version control is publicly available,
nor any other public read-only access to that version control.
Even the source code of the finished version of the standard
is typically not made available to the public (at least not
without forcing people to jump through hoops).

> A standard could lead to more implementations because
> developers would not have to be intimately familiar with the
> {groff,heirloom,neatroff} toolchain before implementing a
> *roff toolchain themselves.

That's not even wishful thinking.  Better maintenance of the
existing implementations would be so much more useful than yet
another implementation.

> It could also lead to more users & use cases because existing
> users could count on systems supporting certain features, so
> they could use *roff in more situations, which would lead to
> more exposure.

You appear to massively overrate the importance end-users
typically attribute to standardization.  Even people *implementing*
a system rarely put such an emphasis on standardization.

> It’s ridiculous that *roff isn’t part of POSIX when it was Unix’s
> killer feature.

You are welcome to spend the many years required to change that.
But be aware that some standardization efforts that are part of
POSIX resulted in parts of the standard that are barely useable
for practical work.  One famous example is make(1).

Don't get me wrong: i think standardization is very nice to have,
should be taken very seriously when available, and provides 

Re: Using tbl(1) for structure definitions

2022-08-12 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Fri, Aug 12, 2022 at 05:58:32PM +0200:

> Since I'm not 100% convinced by any of the ways to format structure 
> definitions in SYNOPSIS, I'm going to go for the status quo.  Since 
> there weren't any structure definitions (at least that I know of) in 
> SYNOPSIS before my introduction, I'll go for what was used in 
> DESCRIPTIONs.  That is embedding them in EX/EE, which uses monospace, 
> which allows me to align perfectly with spaces in any output device.

I consider that a good choice.

> I hope nobody will consider this very harmful.

While .EX may not be perfectly portable (being a v9+GNU extension),
i agree it is useful enough that using it often makes sense.
Btw., Branden sometimes asks for man(7) extensions that i do not
vilify; this is one.  :)

Besides, even if a formatter does not implement it (which won't
happen for many formatters), no content is lost.  Indentation is
lost in that case, but that's not a very serious problem because
C is not Python and line breaks and indentation do not matter for
C syntax and semantics.

> Still, I'm interested in your discussion about the best way to show 
> structured data like this, and the possible 
> portability/readability/accessibility issues of each alternative, so 
> this is just a temporary solution until we agree on something better (if 
> it exists).

I'm not sure whether a better way exists.  Maybe, maybe not.
Certainly tbl(1) isn't part of it.

Yours,
  Ingo



Re: Using tbl(1) for structure definitions

2022-08-12 Thread Ingo Schwarze
G. Branden Robinson wrote on Thu, Aug 11, 2022 at 04:46:12PM -0500:
> At 2022-08-11T15:47:38+0200, Ingo Schwarze wrote:
>> Alejandro Colomar wrote on Tue, Jul 26, 2022 at 10:09:44PM +0200:

>>> I must say that the source code is really ugly (ugly as in,
>>> someone reading it will probably have a hard time modifying it,
>>> without reading tbl(1)).

>> Completely true, but that's not the worst aspect of it.

> I disagree with Ingo's priorities here.  The readability of the
> source is more important for document maintainability.

True, both readability of the source code and rendering quality
matter, and discussing which one matters more in the case at hand
feels moot.

> As we shall see, tbl(1) need not discard as much as Ingo suggests,
> and even if it does (at present), I don't perceive quite the semantic
> damage he does.

I cannot imagine worse damage than reading an alt="" text containing
a non-descriptive filename to a blind reader.  Even with mandoc,
the damage of rendering a structure display as a  is very
severe for a blind reader.

>> In a nutshell, you are making it impossible to decently render
>> the manual page to HTML or to convert it to other formats in
>> any sensible way.
>> 
>> If esr@ (of doclifter fame) were still around, he would be screaming
>> in pain and disgust.

> We have an open bug report requesting a feature to have tbl emit HTML.
> 
> https://savannah.gnu.org/bugs/index.php?60052
> 
> Maybe someone would like to work on this.  The "troffcvt" suite already
> did this many years ago.

An argument that groff -T html could possibly support <table> output
from tbl(1) input if somebody did the work is hardly a justification
that manual page authors should behave as if it did *right now*.

Also, you ignored my observation that even the mandoc -T html output
from a structure display using tbl(1) is very bad for accessibility,
and groff could hardly do better.  The reason is not that mandoc
tbl(7) to HTML conversion is bad but that a structure display *is
not tabular data*.

I really think this is a point we should try to find a consensus on
because using tbl(1) for structure display is such egregiously and
unambiguously bad advice that it would be very detrimental if even
some of the groff developers continued promoting it.

[...]
>>> But at the same time, the result is beautiful,

>> Only in PDF and PostScript output.

> These output formats are _how typesetting is done_ in the modern era.
> 
> I know mandoc doesn't want to dirty its hands with such matters, but
> your militance about the unimportance of typesetting blinkers your
> perspective.

I don't think that is fair.  My argument here is only that a very
minor advantage for PDF and PostScript is not worth completely ruining
HTML output *for manual pages*.  (Besides, there is likely additional
fragility with processors without tbl(1) support, or with incomplete
tbl(1) support.)

I'm not talking about general-purpose typesetting here.

> groff cannot share that perspective.
> 
> "As the most widely deployed implementation of troff in use today, groff
> holds an important place in the Unix universe.  Frequently and
> erroneously dismissed as a legacy program for formatting Unix manuals
> (manpages), groff is in fact a sophisticated system for producing
> high-quality typeset material, from business correspondence to complex
> technical reports and plate-ready books." -- groff Mission Statement,
> 2014
> 
> https://www.gnu.org/software/groff/groff-mission-statement.html

Yes.  It is important that groff provides high-quality typesetting.

But that doesn't mean manual page authors should go out of their
way to optimize typesetting quality and disregard considerations
for any other output format, or for portability and robustness.

[...]
> .TS
> tab(@);
> Lg(type) Lg(identifier) Lg(descriptive-comment).
> int@nflag;@/* ??? */
> .TE

Please don't abuse hypothetical language extensions to justify
using fancy features in manual pages that cause severe trouble
with currently existing software.

[...]
> Yet I would hasten to point out that a synopsis that presents something
> that is nowhere discussed later in the man page makes the document
> deficient.  So if you have semantic markup of all relevant content
> _after_ the synopsis, of which a well-written mdoc(7) page will surely
> boast, then little or nothing is lost in the domains of  searchability
> and discoverability.

This is a digression.  I would argue that the SYNOPSIS should
usually not contain structure displays.  Then again, this is a
page documenting a type, which we both consider a bad idea IIRC.
But none of that is related to the question whether using tbl(1)
for a structure display is a good idea.  If nothing is lost by
the structure n

Re: [PATCH 4/6] xattr.7: wfix

2022-08-12 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Thu, Aug 11, 2022 at 03:17:14PM -0500:
> At 2022-08-11T14:48:51+0200, Ingo Schwarze wrote:
>> Alejandro Colomar wrote on Mon, Aug 01, 2022 at 03:28:03PM +0200:

>>> I'd like to arrive to some consensus on usage of \~ and '\ '.

>> In manual pages, always use "\ " and never use "\~", period.

> This is hugely overstated.

>> The former is portable and the latter is a GNU extension.

> ...that is over 30 years old and supported by Heirloom Doctools troff
> for 17 years now, neatroff for about six, and your mandoc for three.

Actually, mandoc supports \~ at least since Sep 17 2009:
https://cvsweb.bsd.lv/mandoc/Attic/chars.in?rev=1.1=text/x-cvsweb-markup

> For full disclosure, I'll acknowledge that Documenter's Workbench [DWB]
> https://archive.org/details/dwb-preprocessor-ref
> troff doesn't support it, but it doesn't seem to have been maintained
> for 30 years (Heirloom Doctools troff appears to be its
> descendant/successor).

I agree that missing support in DWB is a weak argument.  It is
unlikely that many people use it for practical work.  They would
likely suffer from more serious problems than \~, too.

> plan9port troff doesn't either, and its laudable introduction
> of a man(7) MR macro notwithstanding, its activity level is
> not high.

There are people using Plan 9 for practical work though, they have
even occasionally posted on the groff and mandoc lists, so that is a
bit more of a problem.

> I would pessimistically assume that most or all proprietary Unix
> troffs branched off from V7 Unix troff or early device-independent troff
> (maybe DWB 1.0 troff, ca. 1984 [?, 1]) lack support for `\~`.
> https://github.com/n-t-roff/Solaris10-ditroff/blob/master/troff/n1.c#L797

That does sound likely.  As an example, look at Oracle Solaris 11:

   > uname -a
  SunOS unstable11s 5.11 11.3 sun4u sparc SUNW,SPARC-Enterprise
   > printf "a\\~b\n" | nroff | head -n 1
  a~b
   > printf "a\\~b\n" | groff -T ascii | head -n 1
  a b
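
The comparison above can be turned into a small probe.  What follows
is only a sketch: the function name classify_tilde is made up for
illustration, and it assumes that a formatter lacking \~ prints the
tilde literally, as the Solaris nroff above does.

```sh
#!/bin/sh
# classify_tilde: read formatter output on stdin and guess from the
# first line containing the letter "a" whether \~ was understood.
# Hypothetical helper, not part of groff, mandoc, or any troff.
classify_tilde() {
    line=$(sed -n '/a/{p;q;}')
    case $line in
        *a~b*) echo unsupported ;;  # tilde printed literally
        *ab*)  echo unknown ;;      # escape silently dropped
        *a*b*) echo supported ;;    # something space-like between a and b
        *)     echo unknown ;;
    esac
}

# Only attempted if an nroff command is installed on this machine.
if command -v nroff >/dev/null 2>&1; then
    printf '%s\n' 'a\~b' | nroff 2>/dev/null | classify_tilde
fi
```

Feeding it the two transcripts above yields "unsupported" for the
Solaris nroff output and "supported" for the groff output.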

> I further note that groff has a long tradition of inclusion in BSD
> Unix, https://minnie.tuhs.org/cgi-bin/utree.pl
> ?file=Net2/usr/src/usr.bin/groff/VERSION

Yes.  Cynthia already considered dropping support for Kernighan's
troff, but the CSRG vetoed that.  Inclusion of groff wasn't
controversial even at a time when groff didn't have its own version
control yet.  Consequently, you are right that \~ is unlikely to cause
trouble on any BSD system.

> and despite the efforts of the mdocml/mandoc project to
> supplant or dispose of it groff in BSD's descendant communities, the
> underlying fact remains.  Giving up support for `\~` was therefore, in
> this sense, a regression, and one that took quite some time to address.

I don't think that anyone gave up support for \~.
But we have evidence that some never implemented support for it.

[...]
> As I recall, mandoc does not even support "full justification"
> (alignment of text to both left and right margins, with inter-word
> spaces expanded ["adjusted"] to achieve this) in the first place and
> there are no plans to.

Correct.

> mandoc can thus treat the two sequences as synonymous--

It does.  Mandoc maps all of \  \~ \0 to U+00A0.

> but that doesn't mean the `\~` escape sequence is a gratuitous alias
> or deviation from the norm.

No.  It is useful for general-purpose typesetting,
like many GNU extensions are.

>> portability is still significantly more important

> You are not quantifying anything.  Come on, can we at least get a
> Fermi estimation of the installed bases of the respective troff
> implementations and mandoc?

Frankly, i have no idea how to estimate the number of actively used
installations of Plan 9, Solaris (any version), and possibly
additional commercial systems like AIX and HP-UX, or how to check
what the latter support.

There might be more systems out there parsing manual pages (not
necessarily full-featured roff(7) implementations like those
you listed), but providing specific evidence of such systems
would likely be my job to back up my advice.  I'm not searching
for them right now because we already have a few relevant examples.

>> than such minute typographical details.

> For someone arguing from a standpoint of such slavish fidelity to 40
> year-old practices, you seem to be selective in the way you do it.

Admitted.  Sometimes, i do see the value of new features, even
when they are backward-incompatible.

> The Unix manual was always meant to be typeset.
> 
> "The manual was intended to be typeset; some detail is sacrificed on
> terminals." (man(1), _Unix Time-Sharing System Programmer's Manual_,
> Eighth Edition, Volume 1, February 1985)
> 
> At the time that statement was written, the sentiment was some 12 years

Re: Using tbl(1) for structure definitions

2022-08-11 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Tue, Jul 26, 2022 at 10:09:44PM +0200:

> I must say that the source code is really ugly (ugly as in,
> someone reading it will probably have a hard time modifying it,
> without reading tbl(1)).

Completely true, but that's not the worst aspect of it.

In a nutshell, you are making it impossible to decently render
the manual page to HTML or to convert it to other formats in
any sensible way.

If esr@ (of doclifter fame) were still around, he would be screaming
in pain and disgust.

> But at the same time, the result is beautiful,

Only in PDF and PostScript output.

Did we really learn nothing collectively from the "do not abuse
tables for layout purposes" drama that raged for decades among HTML
authors and HTML standard developers?  Exactly the same applies to
manual pages.

In fact, the same arguments so very familiar from HTML apply to manual
pages even more. HTML tables can at least be imbued with some semantic
capabilities by using CSS, whereas tbl(1) tables are so deeply entrenched
in the "markup is presentational markup only" camp that they can never
hope to convey any semantic function at all.

> and the syntax is really great.
> You can express exactly what you want.

I think you need to revert all that madness of abusing tbl(1)
for alignment of structures.

Just say something like (caution, the following code contains
literal ASCII tab characters):

.nf
struct open_how {
u64 flags;  /* O_* flags */
u64 mode;   /* Mode for O_CREAT, O_TMPFILE */
u64 resolve;/* RESOLVE_* flags */
/* ... */
};
.fi

There is no need to use bold or italic fonts in the structure
display.  Making all the C code bold merely makes the whole
display look heavy and ugly and provides no additional
information.  Making the comments use mixed font looks even
more ugly and is also redundant because the constants are
hopefully already more fully documented elsewhere.

Just use the same indentation conventions as you would
in a *.c or *.h file.

I don't think you need to worry that the alignment might
vary on different output devices.  If you worry anyway,
you can use an explicit roff(7) .ta request before the
display and reset it with .DT after the display.
Formatters that don't support .ta will just ignore it,
so it causes no harm, and groff and mandoc do support it.
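
A minimal sketch of that suggestion follows.  The file name and the
tab stop values (4n and 20n) are assumptions chosen for illustration;
the script merely writes a small man(7) page in which .ta precedes
the display and .DT follows it.

```sh
#!/bin/sh
# Write a small man(7) page whose structure display sets explicit tab
# stops.  File name and tab stop values are assumptions.
page=${TMPDIR:-/tmp}/open_how_demo.7
{
    printf '%s\n' '.TH OPEN_HOW_DEMO 7 2022-08-11 "demo"'
    printf '%s\n' '.SH DESCRIPTION'
    printf '%s\n' '.ta 4n 20n'   # explicit tab stops before the display
    printf '%s\n' '.nf'
    printf '%s\n' 'struct open_how {'
    # Each member line is: tab, declaration, tab, comment.
    printf '\t%s\t%s\n' 'u64 flags;'   '/* O_* flags */'
    printf '\t%s\t%s\n' 'u64 mode;'    '/* Mode for O_CREAT, O_TMPFILE */'
    printf '\t%s\t%s\n' 'u64 resolve;' '/* RESOLVE_* flags */'
    printf '\t%s\n' '/* ... */'
    printf '%s\n' '};'
    printf '%s\n' '.fi'
    printf '%s\n' '.DT'          # restore the default man(7) tab stops
} > "$page"
```

Formatting the resulting file with mandoc or with groff -man should
then show the three comments starting in the same column.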

Yours,
  Ingo



Re: [PATCH 4/6] xattr.7: wfix

2022-08-11 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Mon, Aug 01, 2022 at 03:28:03PM +0200:

> I'd like to arrive to some consensus on usage of \~ and '\ '.

In manual pages, always use "\ " and never use "\~", period.
The former is portable and the latter is a GNU extension.

> What do you think?

I think you are massively overthinking this and the whole SI
argument is irrelevant for manual pages.  While the above concern
about robustness is minor, too (both groff and mandoc support \~),
portability is still significantly more important than such minute
typographical details.
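
For anyone who wants to audit existing pages for the escape, a rough
sketch follows.  The function name is hypothetical, and the check is
deliberately naive: it also flags an escaped backslash followed by a
tilde, so the hits need a human look.

```sh
#!/bin/sh
# list_tilde_escapes: print "file:line:text" for every line in the
# given files that contains the two characters \~ .
# Hypothetical helper; /dev/null forces grep to print file names even
# when only one file is given.
list_tilde_escapes() {
    grep -n -F '\~' /dev/null "$@"
}
```

Each reported line can then be inspected and, where portability
matters more than adjustable spacing, changed to use "\ " instead.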

Yours,
  Ingo



Re: TAB character in groff output

2022-08-11 Thread Ingo Schwarze
Hi Alejandro,

sorry for getting distracted and returning late to the party.

Alejandro Colomar wrote on Tue, Aug 02, 2022 at 05:14:47PM +0200:

[...]
> $ make lint-man-mandoc
> LINT (mandoc) tmp/lint/man7/spufs.7.lint-man.mandoc.touch
> mandoc: man7/spufs.7:748:7: WARNING: tab in filled text

My general recommendation for this warning is:

 * If the tab is used for a good reason (for example, if it is
   in a multi-line code sample that becomes more readable with
   good indentation), wrap the whole code sample in no-break
   mode.  In mdoc(7), that usually means .Bd -unfilled (if
   the sample uses markup) or .Bd -literal (otherwise).
   In man(7), .EX (more semantic) or .nf (more portable)
   can be used.

 * If the tab is not used for a good reason, just get rid of the tab.
   Quite often, that can be achieved in a very simple way.
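
To find candidate lines before running the real linter, something
like the following can help.  It is a rough approximation for man(7)
sources only and is not how mandoc implements its warning; the
function name and the set of macros treated as starting a no-fill
region (.nf, .EX, .TS) are assumptions made for this sketch.

```sh
#!/bin/sh
# tabs_in_filled_text: report tab characters on text lines that are
# outside .nf/.fi, .EX/.EE and .TS/.TE regions.
# Hypothetical helper; an approximation, not the mandoc algorithm.
tabs_in_filled_text() {
    awk '
        # Start of a no-fill display or a table.
        /^\.[ \t]*(nf|EX|TS)([ \t]|$)/ { nofill = 1 }
        # Back to filled text.
        /^\.[ \t]*(fi|EE|TE)([ \t]|$)/ { nofill = 0 }
        # A tab on a text line (not a macro line) in filled mode.
        !nofill && /\t/ && !/^\./ {
            printf "%s:%d: tab in filled text\n", FILENAME, FNR
        }
    ' "$@"
}
```

Lines it reports are the ones to either wrap in .EX or .nf, or to
rewrite without the tab.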

In this case, it is blatantly obvious there is absolutely no reason
whatsoever for using a tab.

Arguably, the whole example should be deleted because it shows
nothing that is complicated enough to require an example.
All parts of the line are completely trivial.

Below "Mount options", a sentence is missing that in fstab(5), the
fs_spec field needs to be set to "none" and the fs_vfstype field to
"spufs" - most users would probably expect both anyway, but being
explicit is better.  I don't think the fs_freq and fs_passno need to
be mentioned; it is clear without saying so that only 0 makes sense
for "none" filesystems.

Remember, it is very bad style to provide EXAMPLES *instead* of
documentation because that leaves the user wondering which parts of
the example are crucial and which arbitrary (e.g., the /spu path),
and why the example was written as it was.

> In the following code:
> 
> $ sed -n 745,749p
> .SH EXAMPLES
> .TP
> .IR /etc/fstab "  entry"

That's terrible style, too.  Manual pages should use complete sentences
and correct English punctuation, for reasons of both clarity and style,
for example:

.SH EXAMPLES
To automatically
.MR mount 8
the SPU filesystem when booting, at the location
.I /spu
chosen by the user, put this line into the
.MR fstab 5
configuration file:
.EX
none /spu spufs gid=spu 0 0
.EE

Just using single spaces is perfectly fine: KISS.

> I think I'll fix it with tbl(1).

That's a very bad idea: tbl(1) should be used very sparingly
and only when there is real tabular data and very strong reasons
to present it as a table.  Even when you actually need to present
tabular data, make an effort to avoid bringing in tbl(1) if at
all possible.

The reason is that tbl(1) is a strongly presentational language
providing no semantic information whatsoever, with the usual
consequence of coming out beautifully in PDF output but usually
rendering *terribly* to HTML with groff and not very well even with
mandoc.  Besides, using tbl(1) in manual pages is more fragile and
less portable than using pure man(7) code.

Look at this fiasco:

   $ groff -t -man -Thtml man7/spufs.7
  [...]
  EXAMPLES
  
  
  /etc/fstab entry
  
  SEE ALSO

   $ mandoc -Thtml man7/spufs.7
  
  EXAMPLES
  
/etc/fstab entry

  

  none
  /spu
  spufs
  gid=spu
  0
  0

  

  
  
  

Imagine being blind and then consider both HTML code snippets from
the perspective of accessibility.

Yours,
  Ingo



Re: Journals typeset in [tg]roff

2022-08-05 Thread Ingo Schwarze
Hi Doug,

Douglas McIlroy wrote on Fri, Aug 05, 2022 at 06:17:41AM -0400:

> Physical Review was an early adopter of troff,

Oh, interesting.  I admit i submitted my own publications
in Phys Rev D in LaTeX in 2000 and didn't know at the time troff
would have been an option.  Then again, at the time, i wasn't
familiar with roff at all.

> but now invites submissions in either REVTeX or MS Word.
> A lot of math journals invite LaTeX.
> Do any journals out there invite [tg]roff?

Sorry, i can't answer that one.

Yours,
  Ingo



Re: TAB character in groff output

2022-08-02 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Tue, Aug 02, 2022 at 10:42:45AM -0500:
> At 2022-08-02T15:44:21+0200, Ingo Schwarze wrote:

>> In groff, this works for me:
>> 
>>  $ printf "a\\N'9'b" | groff -T ascii | hexdump -C | head -n 1
>>  61 09 62 0a 0a 0a 0a 0a  0a 0a 0a 0a 0a 0a 0a 0a |a.b.|
>> 
>> Mandoc behaves differently and treats \N'9' exactly like a literal HT:
>> 
>>  $ printf "a\\N'9'b" | mandoc | hexdump -C | grep 61
>> 0050 61 20 20 20 20 62 0a 0a  20 20 20 20 20 20 20 20 |ab..|
>> 
>> In general, mandoc lets fewer control characters sneak through into
>> output than groff because i worry that control characters in output
>> might occasionally cause reliability or security issues.

> I don't predict high reliability from this technique

Heh.  :-)

I'm used to this statement in the mandoc_char(7) manual page:

  NUMBERED CHARACTERS
 For backward compatibility with existing manuals, mandoc(1)
 also supports the

   \N'number' and \[charnumber]

 escape sequences, inserting the character number from the
 current character set into the output.  Of course, this is
 inherently non-portable and is already marked as deprecated
 in the Heirloom roff manual; on top of that, the second form is
 a GNU extension.  For example, do not use \N'34' or \[char34],
 use \(dq, or even the plain ‘"’ character where possible.

So i assumed it is well-known to not be the pinnacle of portability.

But since

  https://www.gnu.org/software/groff/manual/html_node/Using-Symbols.html

documents \N without explicitly calling out its inherent portability
problems, maybe i should have mentioned the trap.

Then again, the wording

  "Typeset the glyph with code n in the current font ..."

does provide an *implicit* hint that this can hardly be expected to
be device-independent.

> when attempting it on a platform that uses IBM code page 1047
> as its input encoding. ;-)

I would have expected the *output* font numbering to cause even
more serious trouble than the *input* encoding.  Besides, not being
a masochist to the degree you appear to assume, i prefer this
counter-example:

  printf "a\\N'9'b" | groff -T pdf > tmp.pdf

After that, the file tmp.pdf displays three characters:

  The letter "a", the Euro-sign (oops!?), and the letter "b".

I expect we are soon going to dissuade Alejandro from his plan.  :)

Yours,
  Ingo



[bug #62814] consolidate or distinguish tty.tmac and tty-char.tmac

2022-08-02 Thread Ingo Schwarze
Follow-up Comment #5, bug #62814 (project groff):

[comment #4:]
> While not as often a problem in practice these days, the troffrc file is not
> editable by ordinary users, only by root.

As documented in the GNU troff(1) manual, that's trivial to solve for advanced
users:

Copy troffrc to ~/tmac/troff, edit it as desired, then use "troff -M ~/tmac"
or set GROFF_TMAC_PATH=~/tmac in the environment.
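
A sketch of those steps in shell follows.  The function name is
hypothetical, both paths are supplied by the caller because the
location of the system troffrc varies between installations, and the
copy is assumed to keep the file name troffrc so that groff can find
it in the private macro directory.

```sh
#!/bin/sh
# private_troffrc: copy a system troffrc into a private macro
# directory and print the environment setting that makes groff search
# that directory.  Hypothetical helper; no system path is assumed.
private_troffrc() {
    src=$1    # the troffrc shipped with groff (location varies)
    dir=$2    # private macro directory, for example "$HOME/tmac"
    mkdir -p "$dir" &&
    cp "$src" "$dir/troffrc" &&
    printf 'GROFF_TMAC_PATH=%s\n' "$dir"
}
```

After editing the copy, either export the printed setting or pass
the directory with "troff -M" as described above.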


___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




Re: TAB character in groff output

2022-08-02 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Tue, Aug 02, 2022 at 02:14:58PM +0200:

> I'd like to be able to produce ASCII HT ('\t' - horizontal tab) in man 
> pages output.  I don't want to align things; I do want a tab character. 
> Rationale: examples in fstab(5).

I don't understand.  On Debian, fstab(5) is part of the "mount"
package - which seems very reasonable to me - and it says:

  Fields on each line are separated by tabs or spaces.

What's wrong with using spaces?

Wanting to show literal tab characters to users in a manual page
seems a dubious goal to me for two reasons:

 * They are visually indistinguishable from spaces, so if the
   distinction really matters, confusion is almost guaranteed to ensue.

 * Some users may use pagers that convert tabs to spaces or vice
   versa, so even if you hope for pastability, you still need
   luck for it to work as intended.
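
That spaces and tabs are interchangeable here is easy to check with
any tool that splits on whitespace.  The following sketch uses awk,
whose default field splitting is only an approximation of what
fstab(5) parsers do; the function name is made up for illustration.

```sh
#!/bin/sh
# count_fields: for each line on stdin, print the number of
# whitespace-separated fields and the third field (the file system
# type in an fstab line).  Hypothetical helper.
count_fields() {
    awk '{ print NF, $3 }'
}

# The same entry, once separated by single spaces and once by tabs.
printf 'none /spu spufs gid=spu 0 0\n' | count_fields
printf 'none\t/spu\tspufs\tgid=spu\t0\t0\n' | count_fields
```

Both invocations print the same line, because the field structure
does not depend on which kind of whitespace separates the fields.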


> Is that possible?  I didn't find anything in groff_char(7).

In groff, this works for me:

 $ printf "a\\N'9'b" | groff -T ascii | hexdump -C | head -n 1
  61 09 62 0a 0a 0a 0a 0a  0a 0a 0a 0a 0a 0a 0a 0a  |a.b.|

Mandoc behaves differently and treats \N'9' exactly like a literal HT:

 $ printf "a\\N'9'b" | mandoc | hexdump -C | grep 61
0050  61 20 20 20 20 62 0a 0a  20 20 20 20 20 20 20 20  |ab..|

In general, mandoc lets fewer control characters sneak through into
output than groff because i worry that control characters in output
might occasionally cause reliability or security issues.

Yours,
  Ingo



Re: All caps .TH page title

2022-08-02 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Tue, Aug 02, 2022 at 12:58:41AM +0200:

> Would you mind disabling the following warning?:
> 
> mandoc: man3type/regex_t.3type:7:5: STYLE: lower case character
> in document title: TH regex_t

Yes.  Deleting that style recommendation is definitely one of the
steps that will be needed soon.  That needs coordination with at
least Jason McIntyre though because it might possibly disrupt his
work if it is deleted too early.

Eventually, a style recommendation might appear if the case of .Dt
disagrees with the manual page name.  But not before all pages in
the base system have been fixed, obviously.


> Also, may I make you reconsider allowing one to disable specific 
> warnings?

I still seriously dislike that idea.  A mechanism to control
individual diagnostic messages seems easy enough on first sight,
but in practice, i have seen it turn into one of the inroads
eventually causing excessive complexity.  Excessive complexity in
message systems usually does not result from one single giant design
mistake, but instead from accumulating multiple small features that,
each regarded in isolation, all seem innocuous and simple at first -
until they start to accumulate and interact.

To cite a groff example currently being discussed in parallel: it's
similar to already having a classification system, but still adding
ad-hoc controls (variable + if) for a single message that doesn't
seem to fit anywhere - instead of making the system more usable
and systematic without making it more complex.  :-/

Over the years, i have spent considerable amounts of work to make
the mandoc message system simple and systematic (in a certain
early phase, it did suffer from large numbers of small misdesigns,
as indeed the message systems of most projects do).  I do not
want to have all that work wasted.


> There are a lot of empty UR blocks in the Linux man-pages, 
> and I don't consider that a wrong thing.

Upon reconsideration, i came to the conclusion that you are right
about that point and fixed it with the commit appended below.
If you want to apply it locally, applying the man_validate.c
part only is obviously sufficient.


> Also, we use macros in tables, which mandoc(1) doesn't support
> (yet? never?),

It is on the TODO list, but among the most difficult entries on
the TODO list.  During the last few years, i estimate that about
two thirds of what needs to be done for that end has been achieved,
but the last third of the work still isn't easy.

> but that's not a big issue to the man-pages I maintain.

I'm still unsure what to do about that warning.  Just deleting
it is not an option because we clearly *do* want to tell people
about features unsupported by mandoc.

> Yet I want to lint the pages with mandoc(1) for other
> interesting warnings.

One workaround you might possibly consider is treating
the exit status 4 from mandoc(1) as "success" (see the mandoc(1)
manual page for details).  Exit status 2, 3, and 5 and higher
are clearly errors, but 4 means that the most severe issue
found was "unsupported by mandoc", which might possibly make
sense to ignore for your purpose.

Then again, the downside of treating 4 as success is that it
will hide other errors and warnings the same page may contain,
and also that there may be some features that are unsupported
by mandoc and that you do want to avoid.
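
A minimal sketch of that workaround, with the downside just
mentioned still applying.  The wrapper name is hypothetical; it runs
whatever command it is given and maps exit status 4 to success while
passing every other status through unchanged.

```sh
#!/bin/sh
# lint_tolerating_unsupp: run a lint command and treat exit status 4
# ("unsupported by mandoc") like success.  Hypothetical wrapper.
lint_tolerating_unsupp() {
    "$@"
    status=$?
    if [ "$status" -eq 4 ]; then
        return 0
    fi
    return "$status"
}

# Intended use in a script or Makefile recipe, assuming mandoc is
# installed:
#   lint_tolerating_unsupp mandoc -T lint -W warning man7/spufs.7
```

Because the command is passed as arguments, the same wrapper works
unchanged if the linter invocation gains or loses options later.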

Another way out might be to define "-W linux" just like
we already define "-W openbsd" and "-W netbsd" (again, see
the mandoc(1) manual page for details) and let this level
suppress the UNSUPP message about macros inside tables.
I'm not yet sure whether that would be a good idea.

> Is that too hard to implement?

Difficulty of implementation isn't the only reason for rejecting
a feature - if you add every feature that seems easy to implement,
you eventually die a painful death from featuritis.

Yours,
  Ingo


Log Message:
---
If the body of a man(7) .MT or .UR block is empty, do not emit a warning.
Leaving the body empty is legitimate in this case if the author only
wants to display a mail address or URI without providing a link text.
Output modules already handle this correctly: terminal output shows
just the URI without an accompanying text, HTML output uses the URI
for *both* the href= attribute and as the content of the <a> element.

The documentation was also wrong and claimed that an .MT or .UR block
with an empty body would produce no output.  As explained above,
this isn't true.

Bogus warning reported by
Alejandro Colomar .

Modified Files:
--
mandoc:
man_validate.c
mandoc.1
mandoc/regress/man/MT:
args.out_lint
mandoc/regress/man/UR:
args.out_lint

Revision Data
-
Index: args.out_lint
===
RCS file: /home/cvs/mandoc/mandoc/regress/man/MT/args.out_lint,v
retrieving revision 1.3
retrieving revision 1.4
diff -Lregress/man/MT/args.out_lint 

[bug #62776] [troff] add optional diagnostic for sentences ending mid-input line

2022-08-01 Thread Ingo Schwarze
Follow-up Comment #3, bug #62776 (project groff):

Not an objection, just for your consideration.

We already have a method for enabling, disabling, and selecting diagnostic
messages.
Please remember that organizing diagnostic messages is among those areas most
prone to code sprawl and over-engineering,
hence a clean and *uniform* design is even more beneficial in that area than
elsewhere.

Consequently, my gut feeling isn't enthusiastic about adding a second
mechanism for controlling diagnostic messages.


___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




Re: Why are tty-char.tmac and tty.tmac separate files?

2022-08-01 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Sun, Jul 31, 2022 at 05:10:01PM -0500:
> At 2022-07-31T20:02:52+0200, Ingo Schwarze wrote:
>> G. Branden Robinson wrote on Sat, Jul 16, 2022 at 12:05:55AM -0500:

>>> I added fallbacks in tty.char for \[fm] and \[sd] (both CSTR #54
>>> glyphs) in May 2021.  I seem to remember that Ingo followed suit
>>> at least for the latter in mdoc.

>> mandoc renders as follows:
>> 
>> input:\(fm
>> -T ascii: U+0027 APOSTROPHE-QUOTE
>> -T utf8:  U+2032 PRIME
>> 
>> input:\(sd
>> -T ascii: U+0022 QUOTATION MARK
>> -T utf8:  U+2033 DOUBLE PRIME
>> 
>> The latest related commits are:
>> 
>> mandoc/chars.c revision 1.51
>> date: 2022/06/26 20:30:00;  author: schwarze;  state: Exp;  lines: +2 -2;
>> In groff commit 78e66624 on May 7 20:15:33 2021 +1000,
>> G. Branden Robinson changed the -T ascii rendering
>> of \(sd, the "second" symbol, U+2033 DOUBLE PRIME, from '' to ".
>> Follow suit in mandoc.
>> 
>> mandoc/chars.in revision 1.24
>> date: 2014/10/29 03:34:26;  author: schwarze;  state: Exp;  lines: +20 -20;
>> Some fine tuning of console rendering of named special characters.
>> Correct ASCII rendering: \(lb \(<> \(sd# <=== look here ===<
>> Make ASCII rendering agree with groff, using backspace overstrike:
>> \(da \(ua \(dA \(uA \(fa \(c* \(c+ \(ib \(ip \(/_ \(pp \(is \(dd \(dg
>> 
>> Essentially, rev. 1.24 changed " to '' to agree with groff.

> In 2014, I point out, seeking furiously to escape blame for churn...

:-)

>> That was reverted by Branden in 2021 and i followed again,
>> even though with a significant delay caused by laziness on my part.
>> 
>> The mandoc ASCII rendering of \(fm has been stable since it was
>> first supported in 2009.

> There's just no way rendering \(sd the same as \(fm was right.

Neither groff nor mandoc ever did such a thing,
and i did not intend to imply such a claim.

AFAIK, mandoc and groff -T ascii rendering of \(fm
always was *single* U+0027 APOSTROPHE-QUOTE = '

-T ascii rendering of \(sd was:
mandoc 2009-2014:  U+0022 QUOTATION MARK = "
mandoc 2014-2022:  *double* U+0027 APOSTROPHE-QUOTE = ''
groff until 2021:  *double* U+0027 APOSTROPHE-QUOTE = ''
groff since 2021:  U+0022 QUOTATION MARK = "
mandoc since 2022: U+0022 QUOTATION MARK = " again
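
To make the current state concrete, here is a small Python sketch of
the two escapes under both output devices.  It only models the mapping
listed above; the table and function names are made up for this
illustration and have nothing to do with how chars.c is organized.

```python
import re

# Current renderings of the two escapes discussed in this thread.
RENDERINGS = {
    "ascii": {"fm": "'", "sd": '"'},
    "utf8": {"fm": "\u2032", "sd": "\u2033"},  # PRIME, DOUBLE PRIME
}


def render(text, device="ascii"):
    """Replace two-letter \\(xx escapes that the table knows about."""
    table = RENDERINGS[device]
    return re.sub(r"\\\((..)",
                  lambda m: table.get(m.group(1), m.group(0)),
                  text)


print(render(r"11\(fm8\(sd"))           # ASCII device
print(render(r"11\(fm8\(sd", "utf8"))   # UTF-8 device
```

Escapes that are not in the table are passed through unchanged, which
is a simplification chosen for this sketch only.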

> In the U.S., with our antiquated system of weights and measurements,
> it is still common to represent measurements like overpass clearances
> on freeways with signs saying things like
> 
>   11'8"
> 
> ...a length I do not choose at random, but in homage to a source of
> immense, dark entertainment, as "American" as it gets.
> 
> http://11foot8.com/

Maybe if the signs said:  11"+8" (= 19" = 48cm)
truck drivers would be more wary and less often try to sneak past below
during some instant when the bridge is not looking.  ;)

> Indeed I know that very location, having lived in Durham, NC for
> about a year and a half once.

Heh.  I hope that bridge didn't give you a haircut, then.

> And of course these symbols are still used globally in the
> degrees-minutes-seconds representation of angle measures.

Yours,
  Ingo



Re: the compatibility of man(7) (was: man(7) .TH font change, was: groff man(7) `B` macro...)

2022-08-01 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Mon, Aug 01, 2022 at 03:47:19PM +0200:

> Maybe adding "(since groff 1.23.0)" next to the description of MR in the 
> groff manual should be enough to trigger some doubts in programmers.

Putting compatibility information and in particular version numbers
in the middle of the DESCRIPTION is slightly unusual, but in a drastic
case like this, it might actually help to alert authors to situations
where using the new feature is premature.
If all goes well and the new feature is taken up by most systems
after a number of years, we should not forget to relegate it to the
HISTORY section it will then belong, in order to avoid littering the
main text of the description.

[...]
> Checking which versions of a program are packaged for different 
> distros/OSes should be trivial in most cases.

True, for example with repology.org and similar sites.
So "since" together with the groff version number does sound helpful.

[...]
> I think the thing that will save us is that people usually don't even 
> know that groff exists or that it's used internally by man(1) to render 
> the pages.  So, programmers are unlikely to run `man groff_man`, and 
> instead will go for `man 7 man`, which will not talk about MR.

Oh, the man(7) from the man-pages package.  That's probably
another partial mitigation, yes.

The man(7) shipped with mandoc(1) will mention .MR once it is
implemented there, but not in the MACRO OVERVIEW and only in
the MACRO REFERENCE, similar to the other GNU extensions
like .UR and .TQ, marking it as a Plan9+GNU extension,
probably also mentioning "since groff 1.23.0" as you suggest.

Yours,
  Ingo



[bug #62814] consolidate or distinguish tty.tmac and tty-char.tmac

2022-08-01 Thread Ingo Schwarze
Follow-up Comment #3, bug #62814 (project groff):

Please don't overengineer it by defining registers and adding .if statements
and the like.

tty.tmac is called from troffrc.
One simple idea would be to just call tty-char.tmac from the same place.
So power users who don't want it by default can simply comment it out and call
-m tty-char manually in those cases where they do want it.






Re: the compatibility of man(7) (was: man(7) .TH font change, was: groff man(7) `B` macro...)

2022-07-31 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Sat, Jul 16, 2022 at 07:37:54AM -0500:
> At 2022-06-19T19:28:26+0200, Ingo Schwarze wrote:

>> Given that the base system corpora of FreeBSD and NetBSD are of the
>> same order of magnitude as in OpenBSD (and treating DragonFly as a
>> FreeBSD fork for now, which is not fair in general, but good enough
>> for this particular purpose because DragonFly is likely to eventually
>> merge FreeBSD manual page improvements, even if with a long delay)
>> the estimated total number of mdoc manual pages might be about 15k,
>> with the major BSD systems controlling roughly 20-25% each, and
>> third-party software controlling maybe on the order of 20-40%.
>> 
>> To summarize, the *BSD base systems very likely collectively control
>> the majority of existing mdoc(7) manual pages, whereas the Linux man
>> pages project likely controls on the order of 1% of the existing
>> man(7) pages - even when we consider neither commercial UNIX systems
>> nor proprietary software.
>> 
>> That makes compatibility in man(7) significantly more of a concern
>> than in mdoc(7).  All the same, i would certainly not consider
>> adding anything as disruptive as .MR to mdoc(7).
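
The arithmetic behind the quoted estimate can be spelled out in a few
lines of Python.  The inputs are the rough figures from the quoted
text; counting three major BSD systems (OpenBSD, FreeBSD with DragonFly
folded in, NetBSD) is an assumption of this sketch.

```python
TOTAL_MDOC_PAGES = 15_000          # "about 15k"
MAJOR_BSD_SYSTEMS = 3              # assumption: OpenBSD, FreeBSD, NetBSD
BSD_SHARE_PERCENT = (20, 25)       # share of each major BSD system
THIRD_PARTY_PERCENT = (20, 40)     # share of third-party software


def pages(percent_range, count=1, total=TOTAL_MDOC_PAGES):
    """Convert a (low, high) percentage range into page counts."""
    low, high = percent_range
    return (total * low * count // 100, total * high * count // 100)


bsd_low, bsd_high = pages(BSD_SHARE_PERCENT, MAJOR_BSD_SYSTEMS)
print("BSD base systems:", bsd_low, "to", bsd_high, "pages")
print("third party:", pages(THIRD_PARTY_PERCENT))
print("BSD majority even at the low end:", bsd_low > TOTAL_MDOC_PAGES // 2)
```

The upper ends of both ranges cannot hold at the same time because
they would add up to more than 100%, but even the low end of the BSD
range is above half of the estimated total.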

> This argument amounts to saying that because the BSDs control both the
> majority of the extant mdoc(7) pages in the world and the mandoc(1)
> renderer that has been made the default on those systems (at the expense
> of all troffs),

Not exactly "at the expense"; the systems continue providing,
recommending, and using GNU roff for serious typesetting business,
just not for the everyday task of formatting manual pages for the
terminal, nor for the purpose of serving manual pages on the web.

> then the mdoc(7) language can be allowed to evolve.

Actually, i did not yet introduce a single new feature into the
mdoc(7) language that is seriously backward-incompatible unless
i'm forgetting about something.  Oh well, we dropped the requirement
to obey the historical eight-argument-limit about a decade ago
because it was very painful for manual page authors and seriously
anachronistic.
Admittedly, the reason for refraining from backward incompatibility is
not only that even in *BSD, a seriously backward-incompatible feature
would cause disruption for many years to come (even if not quite
as bad as in man(7)), but also that while the mdoc(7) language
has a few minor defects, none of them are so severe that they really
necessitate backward-incompatible changes.  There have been significant
amounts of deprecation aiming to reduce the extent to which some
of the less-well-designed features are used.  Also, i added one new
macro to the mdoc(7) language, but in a fully backward-compatible
way.

> It furthermore implies that because the groff project does not control a
> large proportion of the world's man(7) pages (though I have striven to
> make what we do provide exemplary), the man(7) language _cannot_ be
> allowed to evolve.

My opinion would indeed be that it would be better to avoid evolving
it in a backward-incompatible way and add backward-compatible
extensions in a cautious manner.  There may be exceptions; for
example, i would likely have supported the addition of the .UR
macro had i already been around groff back then because it added
a substantial new feature for information that typically wasn't
included in manual pages at all before it was invented, so only
being partially backward-compatible was maybe acceptable in this
case.  Yet, an equivalent design with better compatibility would
probably have been possible then, too.

> I do not accept this proposition.  You are insisting upon stagnation.

That isn't my intention.  Since that is your interpretation all
the same, we can maybe agree to disagree in this respect.

> Yes, distributors that incorporate packages with man pages using the new
> `MR` macro, racing ahead of those same distributors' update of groff
> 1.23.0 (whenever that happens), run the risk of some unhappy man page
> readers.

As i explained earlier, people doing packaging will rarely be
aware what the technical requirements for formatting any given
manual page are, and when i watch packagers, i don't usually see
them scrutinizing manual pages for good formatting.  And again, the
choice of using .MR will not be made by *any* packager, but by the
upstream project.  So the packagers would only have the choice to
delay the package update, waiting for a typically *different*
packager to update groff.  It is not clear delaying a software
update for such a reason is even a good idea.

> If a distributor takes a wait-and-see attitude toward groff 1.23.0,
> whether deliberately or because their package maintainer is comatose,
> then another contributor who doesn't have a great command of *roff can
> apply the following patch to

Re: mdoc(7) prologue regressions

2022-07-31 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Sat, Jul 16, 2022 at 02:21:49AM -0500:
> At 2022-06-27T00:29:08+0200, Ingo Schwarze wrote:

>> The first issue i identified is a group of regressions in the
>> behaviour of the mdoc(7) prologue macros .Dt and .Os.
>> The regressions aren't particularly severe because all that i found
>> so far only trigger when the document uses these macros incorrectly.
>> All the same, i'd like to report them such that we can decide
>> whether we want to fix some or all of them.
>> 
>> I suspect that this commit might be responsible but admit
>> that i did not prove this suspicion by testing right before
>> and right after the commit.  I only tested that the behaviour
>> changed as described below from groff-1.22.4 to groff-current:
>> 
>>   commit a1e6c19176d38823d8dc6c9a619a493ca90bdca4
>>   Author: G. Branden Robinson 
>>   Date:   Sun Oct 3 23:15:12 2021 +1100
>> 
>>   [andoc,man,mdoc]: Fix Savannah #61266.
>> 
>>   Resolve problems in batch rendering of man pages to PDF arising from
>>   entanglement of end-of-input traps, page location traps, continuous
>>   rendering mode, and andoc's reloading of the (m)an and (m)doc packages.
>>   [...]

> That commit was an immense pain to get "right".  As I feared, my words
> from later in the commit message have come back to haunt me.
> 
>  Refactoring is needed: some macros and registers have misleading names,
>  there is some code duplication in mdoc, and some of the trap management
>  problems are solved in slightly different ways in man(7) and mdoc(7),
>  perhaps unnecessarily.  We also need some test scripts to protect us
>  from regressions.  But this fixes the rendering problems.
> 
> I didn't do the regression tests.  But it probably would not have
> occurred to me at the time to test the incorrect usage modes of the
> mdoc(7) macros.
> 
> For all of these issues, I have the same pair of questions:
> is that a regression or just a difference?

That's a matter of interpretation.

I might call it a regression if the old behaviour is clearly
more useful than the new, and a mere difference otherwise.

> Is there a specification for this scenario?

No.  The closest we have to a "specification" for mdoc(7) is
the manual pages in the mandoc and groff repositories, and
mdoc(7) for example says:

  The prologue, which consists of the Dd, Dt, and Os macros in that
  order, is required for every document.

So forgetting or repeating them or mixing up their order is clearly
invalid syntax.  Still, it might be useful to handle such syntax
errors gracefully.  If we decide which behaviour we want, i can
fix up the mandoc code and the mandoc testsuite accordingly.
I expect changes in mandoc will be significantly easier than in
groff because coding in C is usually *much* easier than writing
code for identical functionality in roff(7).
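
As an illustration of what "handling such syntax errors gracefully"
might start from, here is a rough Python sketch that checks the
prologue rule quoted above.  It is not how mandoc or groff validate
documents; the function names and the message wording are invented
for this example, and treating everything before the first .Sh as
the prologue is a simplifying assumption.

```python
PROLOGUE = ["Dd", "Dt", "Os"]


def macro_name(line):
    """Return the macro name of a macro line, or None for text lines."""
    if not line.startswith("."):
        return None
    parts = line[1:].split()
    return parts[0] if parts else None


def check_prologue(source):
    """Report missing, repeated, or misordered prologue macros."""
    seen = []
    for line in source.splitlines():
        name = macro_name(line)
        if name == "Sh":          # first section header ends the prologue
            break
        if name in PROLOGUE:
            seen.append(name)
    problems = []
    for name in PROLOGUE:
        count = seen.count(name)
        if count == 0:
            problems.append("missing ." + name)
        elif count > 1:
            problems.append("repeated ." + name)
    if not problems and seen != PROLOGUE:
        problems.append("wrong order: " + " ".join(seen))
    return problems


print(check_prologue(".Dd July 31, 2022\n.Dt TEST 1\n.Os\n.Sh NAME\n"))
print(check_prologue(".Dd July 31, 2022\n.Os\n.Sh NAME\n.Dt TEST 1\n"))
```

A .Dt that only appears after the first .Sh is reported as missing
from the prologue, which matches the "late" case discussed below.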

>>  1. When there are two .Dt macros in the prologue, the last one used
>> to win, setting the page title, section number, and section title.
>> Now, the first one wins, setting these fields.

Not sure whether either of these is better.  Maybe the question to
ask is: which one works better with having multiple manual pages
in the same input file?  Not sure.

Would there be value in having duplicate .Dt behave in a way similar
to duplicate .TH?  Not sure either.

>>  2. When a .Dt macro occurs in the body of the page (as opposed to
>> in the prologue), it used to be ignored.  Now, it causes a
>> large number of blank lines in the output.

I would call that one a regression because the new behaviour is
clearly not useful and likely annoying for the user.

>> Both issue 1 and issue 2 can be seen with this test file:
>> https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/regress/usr.bin/mandoc/mdoc/Dt/dupe.in?rev=1.4=text/plain
>> 
>>  3. When the first .Dt macro comes late, the page title used to be
>> set to "UNTITLED".  Now, it is set to the empty string.

I would also call that a regression because UNTITLED is more effective
for alerting the author that something is amiss, even if they failed
to notice the related error message.

>> Both issue 2 and issue 3 can be seen with this test file:
>> https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/regress/usr.bin/mandoc/mdoc/Dt/late.in?rev=1.2=text/plain
>> 
>>  4. If there is no .Dt macro at all, the page title used to be
>> set to "UNTITLED".  Now, it is set to the empty string, see:
>> 
>> https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/regress/usr.bin/mandoc/mdoc/Dt/missing.in?rev=1.2=text/plain

Ditto, old behaviour seems more useful to me.
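
The choices under discussion for issues 1, 3, and 4 fit into a few
lines of Python.  This is only a model of the behaviours as reported
in this thread, not code from either formatter; the function and
parameter names are hypothetical.

```python
def page_title(prologue_titles, winner="last", fallback="UNTITLED"):
    """Pick the page title from the .Dt macros found in the prologue.

    winner="last", fallback="UNTITLED" models the old behaviour;
    winner="first", fallback="" models the new behaviour as reported.
    """
    if not prologue_titles:       # .Dt missing or coming too late
        return fallback
    return prologue_titles[-1] if winner == "last" else prologue_titles[0]


print(page_title(["FOO", "BAR"]))                   # old: last one wins
print(page_title(["FOO", "BAR"], winner="first"))   # new: first one wins
print(page_title([]))                               # old: UNTITLED
```

Whichever .Dt ends up winning, keeping the UNTITLED fallback is
independent of that decision.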

>>  5. When the usual order of .Dt 

Re: Why are tty-char.tmac and tty.tmac separate files?

2022-07-31 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Sat, Jul 16, 2022 at 12:05:55AM -0500:

> I added fallbacks in tty.char for \[fm] and \[sd] (both CSTR #54
> glyphs) in May 2021.  I seem to remember that Ingo followed suit
> at least for the latter in mdoc.

mandoc renders as follows:

input:\(fm
-T ascii: U+0027 APOSTROPHE-QUOTE
-T utf8:  U+2032 PRIME

input:\(sd
-T ascii: U+0022 QUOTATION MARK
-T utf8:  U+2033 DOUBLE PRIME

The latest related commits are:

mandoc/chars.c revision 1.51
date: 2022/06/26 20:30:00;  author: schwarze;  state: Exp;  lines: +2 -2;
In groff commit 78e66624 on May 7 20:15:33 2021 +1000,
G. Branden Robinson changed the -T ascii rendering
of \(sd, the "second" symbol, U+2033 DOUBLE PRIME, from '' to ".
Follow suit in mandoc.

mandoc/chars.in revision 1.24
date: 2014/10/29 03:34:26;  author: schwarze;  state: Exp;  lines: +20 -20;
Some fine tuning of console rendering of named special characters.
Correct ASCII rendering: \(lb \(<> \(sd# <=== look here ===<
Make ASCII rendering agree with groff, using backspace overstrike:
\(da \(ua \(dA \(uA \(fa \(c* \(c+ \(ib \(ip \(/_ \(pp \(is \(dd \(dg

Essentially, rev. 1.24 changed " to '' to agree with groff.
That was reverted by Branden in 2021 and i followed again,
even though with a significant delay caused by laziness on my part.

The mandoc ASCII rendering of \(fm has been stable since it was
first supported in 2009.

Yours,
  Ingo



[bug #62814] consolidate or distinguish tty.tmac and tty-char.tmac

2022-07-31 Thread Ingo Schwarze
Follow-up Comment #1, bug #62814 (project groff):

Arguably, the definitions in tty.tmac are of such a high quality that any user
of terminal output wants to use them unconditionally, and warnings about
characters in the input document that need any of these fallbacks would make
little sense.

Most of the fallbacks in tty-char.tmac, on the other hand, are arguably
desperate attempts to somehow render the respective characters even when good
renderings don't really exist, and most are of dubious or even poor quality. 
Some users might possibly prefer warnings about unavailable characters to poor
or cryptic renderings, and hence decide to not load tty-char.tmac, in
particular if they consider changing the input file to more gracefully render
to ASCII.

That said, while i see why some users might want to *not* load tty-char.tmac
for some uncommon purposes, personally i never felt a need to suppress its
loading for my purposes.  Quite to the contrary, it happened several times
that i simply forgot to load it, resulting in badly degraded output.  So i
think most users would be better served by loading it by default.

If you want to be extra careful, you could leave these in their own file
though, such that people who want to scrutinize a document for characters that
render badly to ASCII can more easily do so by disabling loading the file in
one way or another.
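
The trade-off can be pictured with a toy model in Python: with a
fallback table loaded, every character gets some rendering; without
it, the user gets warnings that point at the problematic input.  The
table contents, the function name, and the warning wording are all
invented for this sketch and do not reproduce any groff file or
diagnostic.

```python
# Illustrative fallback table; not the contents of any real macro file.
FALLBACKS = {"fm": "'", "sd": '"'}


def render_ascii(glyph_names, fallbacks):
    """Render glyph names to ASCII, collecting warnings for gaps."""
    output, warnings = [], []
    for name in glyph_names:
        if name in fallbacks:
            output.append(fallbacks[name])
        else:
            warnings.append("no ASCII rendering for \\(" + name)
    return "".join(output), warnings


print(render_ascii(["fm", "sd"], FALLBACKS))   # fallbacks loaded
print(render_ascii(["fm", "sd"], {}))          # fallbacks not loaded
```

Keeping the table in a separate file is what makes the second call
possible at all.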






Re: man -M tcl (was: All caps .TH page title)

2022-07-29 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Fri, Jul 29, 2022 at 02:03:51PM +0200:

> BTW, I guess you also have the POSIX man pages in BSDs.  Do they come 
> from the kernel repo that I maintain, or do you have your own separate 
> repos?

  $ less /usr/ports/books/man-pages-posix/Makefile
 [...]
 COMMENT =      POSIX manual pages
 DISTNAME =     man-pages-posix-2013-a
 MASTER_SITES = https://www.kernel.org/pub/linux/docs/man-pages/man-pages-posix/
 EXTRACT_SUFX = .tar.xz
 [...]
 DOCDIR = ${PREFIX}/share/doc/posix
 [...]
 # mapping of categories: source => destination
 MANS =  0p  3
 MANS += 1p  1
 MANS += 3p  3
 [...]

I don't know off the top of my head what FreeBSD and NetBSD ports do,
but you can no doubt look it up if you are interested.

> I'd like to discuss about the best place to recommend putting manual pages.
> 
> Do you know if any projects (Tcl and Tk maybe) are already using a 
> specific path for man pages?

 $ pkg_locate /man/man | grep -v :/usr/local/man | \
   sed 's/[^\/]*\/[^\/]*$//' | sed -E 's/(.*):(.*):(.*)/\3:\1/' | \
   sort | uniq > tmp.txt
 $ vi tmp.txt  # minimal manual cleanup
 $ cat tmp.txt
/usr/local/cyrus/man/:cyrus-imapd-3.4.4
/usr/local/heirloom-doctools/man/:heirloom-doctools-191015p0
/usr/local/jdk-1.8.0/man/:jdk-1.8.0.332.b09.1v0
/usr/local/jdk-11/man/:jdk-11.0.15.10.1v0
/usr/local/jdk-17/man/:jdk-17.0.3.7.1v0
/usr/local/lib/eopenssl/man/:openssl-1.0.2up4
/usr/local/lib/eopenssl11/man/:libretls-3.5.2
/usr/local/lib/eopenssl11/man/:openssl-1.1.1q
/usr/local/lib/eopenssl30/man/:openssl-3.0.5
/usr/local/lib/erlang21/man/:erlang-21.3.8.24v0
/usr/local/lib/erlang21/man/:erlang-wx-21.3.8.24v0
/usr/local/lib/node_modules/npm/man/:node-16.15.1v0
/usr/local/lib/ruby/gems/3.1/gems/kramdown-2.3.1/man/:ruby31-kramdown-2.3.1
/usr/local/lib/stk/4.0.1/man/:STk-4.0.1p19
/usr/local/lib/swipl-7.6.0/xpce/prolog/lib/:swi-prolog-7.6.0p15
/usr/local/lib/tcl/tcl8.5/man/:tcl-8.5.19p6
/usr/local/lib/tcl/tcl8.6/man/:tcl-8.6.12
/usr/local/lib/tcl/tk8.5/man/:tk-8.5.19p2
/usr/local/lib/tcl/tk8.6/man/:tk-8.6.12
/usr/local/plan9/man/:plan9port-20210323
/usr/local/riscv32-esp-elf/share/man/:riscv32-esp-elf-binutils-2.35.1.2020.1223
/usr/local/riscv32-esp-elf/share/man/:riscv32-esp-elf-gcc-8.4.0.2021.2
/usr/local/riscv32-esp-elf/share/man/:riscv32-esp-elf-gdb-2.35.1.2021.2
/usr/local/share/doc/posix/man/:man-pages-posix-2017a
/usr/local/share/docbook2X/xslt/:docbook2x-0.8.8p5
/usr/local/share/fish/man/:fish-3.4.1p3
/usr/local/share/libowfat/man/:libowfat-0.32p0
/usr/local/share/man/:smplayer-22.2.0
/usr/local/xtensa-esp32-elf/share/man/:xtensa-esp32-elf-binutils-2.35.1.2020.1223p0
/usr/local/xtensa-esp32-elf/share/man/:xtensa-esp32-elf-gcc-8.4.0.2021.2
/usr/local/xtensa-esp32-elf/share/man/:xtensa-esp32-elf-gdb-2.35.1.2021.2p0
/usr/local/xtensa-esp32s2-elf/share/man/:xtensa-esp32s2-elf-binutils-2.35.1.2020.1223
/usr/local/xtensa-esp32s2-elf/share/man/:xtensa-esp32s2-elf-gcc-8.4.0.2021.2
/usr/local/xtensa-esp32s2-elf/share/man/:xtensa-esp32s2-elf-gdb-2.35.1.2021.2
/usr/local/xtensa-esp32s3-elf/share/man/:xtensa-esp32s3-elf-binutils-2.35.1.2020.1223
/usr/local/xtensa-esp32s3-elf/share/man/:xtensa-esp32s3-elf-gcc-8.4.0.2021.2
/usr/local/xtensa-esp32s3-elf/share/man/:xtensa-esp32s3-elf-gdb-2.35.1.2021.2
/usr/local/xtensa-lx106-elf/share/man/:xtensa-lx106-elf-binutils-2.32p0
/usr/local/xtensa-lx106-elf/share/man/:xtensa-lx106-elf-gcc-10.2.0p3

I can't say so far if those paths are the default paths upstream chose
or in how many cases the OpenBSD porter chose them instead.
Finding out requires looking at each of these about 35 ports
individually.
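
For readers who find the sed expressions above hard to parse, here is
a rough Python equivalent of the pipeline.  It assumes input lines of
the form pkgname:portpath:/file/path, which is what the three-group
sed expression implies; the sample lines and file names below are made
up for illustration, and Python's sort order may differ from sort(1)
depending on the locale.

```python
import re


def man_dirs(lines):
    """Mimic the grep, grep -v, sed, sed, sort, uniq pipeline above."""
    result = set()
    for line in lines:
        if "/man/man" not in line:            # pkg_locate /man/man
            continue
        if ":/usr/local/man" in line:         # grep -v :/usr/local/man
            continue
        # Strip the last two path components, e.g. "man1/foo.1".
        line = re.sub(r"[^/]*/[^/]*$", "", line)
        # Reorder "pkgname:portpath:dir" into "dir:pkgname".
        line = re.sub(r"(.*):(.*):(.*)", r"\3:\1", line)
        result.add(line)                      # uniq
    return sorted(result)                     # sort


sample = [
    "tcl-8.6.12:lang/tcl/8.6:/usr/local/lib/tcl/tcl8.6/man/man1/tclsh.1",
    "tcl-8.6.12:lang/tcl/8.6:/usr/local/lib/tcl/tcl8.6/man/man3/Tcl_Init.3",
    "foo-1.0:misc/foo:/usr/local/man/man1/foo.1",
]
print(man_dirs(sample))
```

The result is one line per directory and package, in the same
dir:pkgname format as the list above.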

> I think something under $docdir would be a nice place.
> 
> The FHS mentions[1] .
> GNU specifies[2] that $docdir should be  
> for a  prefix.
> 
> So they seem to agree in where $docdir lives.  Then we could make the 
> pkg-specific mandirs be .
> 
> What are your thoughts?

Yes, even though /usr/local/share/doc/pkgname/man/man* is a bit long,
it makes more sense than paths like

  /usr/local/cyrus/man/
  /usr/local/heirloom-doctools/man/
  /usr/local/lib/erlang21/man/
  /usr/local/lib/node_modules/npm/man/
  /usr/local/lib/stk/4.0.1/man/
  /usr/local/lib/tcl/tcl8.6/man/
  /usr/local/plan9/man/
  /usr/local/share/fish/man/

Then again, *if* we go the -M alias way, these paths are only
ever used in the man.conf(5) file.  So where exactly they are
has no major impact on the user and is more a matter of system
cleanliness.

Yours,
  Ingo



Re: man(7) .TH font change (was: groff man(7) `B` macro)

2022-07-27 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Sat, Jul 16, 2022 at 10:38:47AM -0500:
> At 2022-06-18T21:05:09+0200, Ingo Schwarze wrote:

>> The header line does not contain a cross reference, so there is no
>> justification for marking it up in the same way as a cross reference.

> I think there is, because that way the man page you are looking for
> announces itself to your eyes in precisely the same typefaces you saw
> when you read a cross reference to it.  I think this eases recognition
> and constitutes better ergonomics, but that's my opinion.

I don't think the congruence between .TH and .MR you describe really
matters.  The name(number) syntax stands out so much in English text
that it is instantly recognizable no matter the typeface.

If you really think the congruence matters, i understand your
argument even less.  In mdoc(7), this was always consistent:
both .Dt and .Xr always render in roman font.  By contrast,
man(7) was much less consistent in the past:

 - man(7) always used roman font for .TH
 - For cross references, AT&T UNIX Sixth Edition used R(R)
   inside SEE ALSO and an inconsistent mixture of R(R), I(R),
   and I(I) outside SEE ALSO.
 - For cross references, AT&T UNIX Seventh Edition used R(R)
   inside SEE ALSO and I(R) outside SEE ALSO (as discussed on
   this list around August 4, 2021).
 - various man(7) pages from various projects also used bold face
   for cross references.

You say it matters that .TH appears in the same font as the .MR it
came from?  Well, that won't be the case when linking from mdoc(7) to
man(7) pages nor when linking from man(7) to mdoc(7) pages with your
changes.  The consistency in the header lines even decreases.

[...]
>> Finally, we are talking about a header line in the page margin.  This
>> is *not* something that should be emphasized by using italic or bold
>> font.

> I have to wonder where you get your evidence from sometimes.

Ouch.  I'm sorry you spent so much time on this claim.

[...]
> "Most".  Describe your method of measurement.

Looking at a smaller sample of books on my shelf, and finding only
Stroustrup as a counter-example.  It appears my sample was too small
and my claim that using bold or italic typefaces in header lines
is unusual is not true (I should really know Poisson statistics
better *blush*).  Your larger sample proves that practice varies
widely.

>> You might say: "Why do you bother?  You are going to set \*(MF
>> to R anyway on *BSD.  So it makes no difference to you."
>> And indeed, *if* you push through with .MR, that's exactly
>> what i will do in groff on OpenBSD and in mandoc on all *BSDs,
>> and in mandoc, MF=R will not even be optional but hard-coded.

> Okay.  It's free software.  I don't think mandoc's users will be
> distressed; you already deny them hyphenation

True.

> and a configurable line length.

No longer true; mandoc supports both the -O width= argument,
and even lets the manual page override that by the roff(7) .ll
request.  If the terminal window is narrower than 80 columns,
it even reduces the default line length automatically, with no
need for -O width=.
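
The precedence described here can be summarized in a short Python
sketch.  It only models the order of precedence stated above; the
default of 80 columns and all names are assumptions of this example,
and the exact numbers mandoc uses may differ.

```python
DEFAULT_COLUMNS = 80  # assumed default for this sketch


def line_length(terminal_columns, o_width=None, ll_request=None):
    """Decide the output line length.

    An .ll request in the page overrides -O width=, which in turn
    overrides the default; the default shrinks on narrow terminals.
    """
    if ll_request is not None:
        return ll_request
    if o_width is not None:
        return o_width
    return min(DEFAULT_COLUMNS, terminal_columns)


print(line_length(100))                              # wide terminal
print(line_length(60))                               # narrow terminal
print(line_length(100, o_width=72))                  # -O width=72
print(line_length(100, o_width=72, ll_request=65))   # page uses .ll
```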

[...]
>> All the same, i prefer sane defaults over eccentric defaults
>> that need to be patched away, and i prefer common conventions
>> to markup fragmentation.  Surely you don't expect the font
>> conventions for mdoc(7) .Xr and .Dd to change now, after
>> three decades in production,

> I don't expect _you_ to change it.

Well, i'm also thinking about groff_mdoc(7).

I usually strive for mandoc(1) -T ascii and -T utf8 output to be
byte-for-byte identical to groff output.

But when output from groff_man(7) is inconsistent with output from
groff_mdoc(7), i usually decide to let mandoc(1) follow the mdoc(7)
conventions for both languages, for consistency, and let the mandoc
implementation of man(7) diverge from groff_man(7) so far as that
cannot be helped.  And then i patch the OpenBSD groff port to be
consistent with itself and with mandoc, even if that makes it diverge
from upstream groff.

[...]
>> You are aware that the syntax and semantics of .MR is completely
>> identical to the .Xr macro that Cynthia invented 30 years ago?

> How is that a bad thing?  You spent another prong of this thread
> derogating its design intensely.

If you were designing man(7) from scratch, it would be a good design.

I criticised the design of .MR for its lack of backward compatibility,
not for any *intrinsic* design problems.

>> Making it render differently also looks like a dubious choice
>> to me, in addition to the topic of this mail.

> The sources of your doubt seem to me to be your personal preference
> backed up by unfounded assertions that are readily contradicted by
> observation.

True, you can consider my rash claim that bold and italic fonts
in header lines are unusual dispr

Re: All caps .TH page title

2022-07-27 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Sun, Jul 24, 2022 at 10:44:47AM -0500:
> At 2022-07-24T16:57:19+0200, Ingo Schwarze wrote:

>> But dismissing decade-old *BSD standards like the use of /usr/ for the
>> base system and /usr/local/ for packages as a standard violation, and
>> promoting /opt/ which is firmly a Linux-only invention,

> Oh, no it's not.  I remember that thing from Solaris 2.3 or 2.4.
> Here's a slightly later source.
> 
> https://docs.oracle.com/cd/E19455-01/805-6331/fsadm-17/index.html

Oops, thanks for setting me right.

Confirmed:

   > uname -a
  SunOS unstable11s 5.11 11.3 sun4u sparc SUNW,SPARC-Enterprise
   > ls -al /opt
  total 2613
  drwxr-xr-x 12 root other      13 Dec 31 2020 .
  drwxr-xr-x 19 root root       22 Aug 17 2018 ..
  drwxr-xr-x  4 root other       4 Feb 10 2015 bop
  drwxr-xr-x 25 root bin        29 Dec  1 2017 csw
  drwxr-xr-x 10 root sys        11 Aug 17 2018 developerstudio12.5
  drwxr-xr-x 10 root sys        11 Aug 17 2018 developerstudio12.6
  drwxr-xr-x  3 root root        3 Feb 10 2015 local
  drwxr-xr-x 12 root sys        12 Jan 22 2015 solarisstudio12.3
  drwxr-xr-x 10 root sys        11 Dec 22 2015 solarisstudio12.4
  drwxr-xr-x 13 root sys        13 Jan 22 2015 solstudio12.2
  drwxr-xr-x 13 root sys        15 Oct 29 2015 SUNWspro
  -rw-------  1 root root  1311633 Oct 29 2015 uninstall_Sun_Studio_12.class
  drwxr-xr-x  3 root root        3 Feb 18 2015 VRTS

It doesn't look as if modern Oracle Solaris uses it very extensively,
but still, it does exist.

[...]
> Under this umbrella, the Linux kernel is effectively under the BSD
> license.

Except that free software projects cannot copy from it - that's
quite a big BUT since allowing *everybody* to copy the code for
any purpose is the central idea of the BSD license.  ;-)

[...]
> The BSD camp did ultimately win the copyleft argument after all.

I'm not so sure about that.  The idea of the BSD license is to
allow all uses that can be licensed to others according to the Berne
Convention, retaining only those rights - essentially the moral rights,
like being known as the author, and abuse of the Works for slandering
the author being prohibited - that are inalienable in the first place
according to the Berne Convention.

Even if in effect, the Copyleft aspect of the GPL is not usually
enforced against Foundation members, GPL code is far from as free
as the Berne Convention would permit it to be, and far from as free
as if it were under a BSD license.

So essentially, you say that in practice, the GPL fails to attain
the goals RMS designed it for, and i say that all the same, it has
some serious and (hopefully) unintended detrimental side effects.

I can't say i'm too happy with that.
I certainly don't regard it as a win.
It looks more like a lose-lose situation to me.

But i don't think we can do much about that.  Groff is still
usable for most users without too much pain.  Unless i want to
contribute significant amounts of code, even i could do so.
And to be fair, even if i wanted to contribute large amounts of
code, the GPL would *not* prevent me from doing that - the thing
that would stop me is the FSF CLA, which is an entirely different
beast.

Nuff' digression!
  Ingo



Re: [PATCH v3] NULL.3const: Add documentation for NULL

2022-07-27 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Tue, Jul 26, 2022 at 02:02:56PM +0200:
> On 7/25/22 20:49, Ingo Schwarze wrote:
>> Alejandro Colomar wrote on Sun, Jul 24, 2022 at 09:19:32PM +0200:

>>> +.B 0

>> There is really no need to mark up integer constants.

> groff_man_style(7):
>Use  bold for literal portions of syntax synopses,
>for command‐line options in running text, and  for
>literals  that are major topics of the subject un‐
>der discussion; for example, this page  uses  bold
>for  macro,  string,  and  register  names.  In an
>.EX/.EE example of  interactive  I/O  (such  as  a
>shell session), set only user input in bold.
> 
> Since this is a literal that is a major topic of the subject under 
> discussion, I prefer to be a bit pedantic here and boldface it.
> 
> I guess it's no big issue; does it hurt readability too much for you?

No, it's no big deal and doesn't hurt readability.  It only looks
slightly unusual.

There are cases where a bare '0' character needs to be bold face,
for example when discussing ".Fl 0" in xargs(1).  But here, you
are just talking about the integer 0.  The "major topic" here
is ".Dv NULL", not the ordinary integer 0 that is internally used
to define that constant.
Bold face is mostly for literals that the user needs to type -
and the user is specifically *not* supposed to type this 0.

That's why i consider the .B pointless; then again, it does not
do much harm either.


>>> +.SH CONFORMING TO

>> That should be ".SH STANDARDS".

> We use CONFORMING TO in Linux.  Don't know why; just history, I guess.
> See man-pages(7).

Weird.

I failed to find a single instance of "CONFORMING TO" in AT&T UNIX
(including v6, PWB, v7, 32v, v8, v10, System III, SVR1, SVR2) nor in
any version of UCB CSRG BSD.  So considering that System V and BSD are
widely considered the two main original branches of the development
of Unix-like operating systems and Linux is often considered to have
drawn inspiration from both, the section name "CONFORMING TO" does
not appear to be a UNIX thing.  For example, Aeleen Frisch, "Essential
System Administration", O'Reilly, Cambridge 1995, considers Linux
as slightly more influenced by 4.3BSD than by System V Release 3.

STANDARDS, on the other hand, is present since 4.3BSD-Reno (June 1990).

4.3BSD-Reno predates the first version of the Linux kernel by more than
a year, and the first Linux manual pages probably for longer than that.

So i have no idea where "CONFORMING TO" may have come from.


[...]
>>> +.SH BUGS

>> The following is misplaced in BUGS.  It is not talking about any bug,
>> nor about any API design defect.

> Oh, I do consider this a bug in the API design.  I placed it there on 
> purpose.
> 
> Allowing the bit pattern of all 0s to represent a valid pointer (and 
> thus not a null pointer) is something that could be meaningful many 
> decades ago, when architectures were more different.
> 
> Nowadays, every arch out there is 2's complement, and uses 0s as the 
> null pointer.  The standard should simplify, and allow memset(2)ing 
> pointers.
> 
> In fact, AFAIK, the next revision of POSIX will fix that bug.  But I 
> don't remember well the details of that.

Fair points.  You convinced me this isn't misplaced in BUGS.
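
The portability point in the quoted text can be illustrated with a
short C sketch.  The struct and function names are invented for this
illustration.  ISO C guarantees the result of the initializer version;
it formally does not guarantee the result of the memset(3) version,
even though both agree on mainstream platforms today.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example type, not taken from any real API. */
struct node {
    struct node *next;
    int          value;
};

/*
 * Portable: an initializer sets the pointer member to a null
 * pointer, whatever bit pattern the platform uses for that.
 */
int
init_gives_null(void)
{
    struct node n = { NULL, 0 };

    return n.next == NULL;
}

/*
 * Formally not guaranteed by ISO C: memset(3) produces the bit
 * pattern of all 0s, which is a null pointer only if the platform
 * represents null pointers that way.
 */
int
memset_gives_null(void)
{
    struct node n;

    memset(&n, 0, sizeof(n));
    return n.next == NULL;
}
```

On a platform where null pointers are all 0s, both functions return 1.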

Yours,
  Ingo



Re: [PATCH v4] NULL.3const: Add documentation for NULL

2022-07-27 Thread Ingo Schwarze
Hi,

Alejandro Colomar wrote on Tue, Jul 26, 2022 at 08:47:59PM +0200:
> On 7/26/22 17:54, G. Branden Robinson wrote:
>> Alejandro Colomar wrote:

>>> +.SH NOTES
>>> +The following headers also provide
>>> +.BR NULL :
>>> +.IR  ,
>>> +.IR  ,
>>> +.IR  ,
>>> +.IR  ,
>>> +.IR  ,
>>> +.IR  ,
>>> +and
>>> +.IR  .

>> Not exactly on topic (sorry), but apropos of our concurrent discussion
>> on man pages for constants with external linkage and data types, the
>> foregoing is an excellent counterexample of what I contend is good
>> practice: document them in the man page for the header file.

> For countering that, I'd point you to tm(3type).
> If I document in such detail every type and constant in , the 
> page will become an unreadable mess, IMO.
> 
> NULL(3const) is also a good example.  stddef(0posix) has 2 lines + a 
> blank for NULL.
> 
> Now consider a page that documents NULL, offsetof(), ptrdiff_t, size_t 
> (all of them already documented, so you can take a look at their pages 
> to get an idea of their docs), + wchar_t (yet undocumented).
> 
> stddef.h(0posix) is very short about them, and IMO, it's quite 
> incomplete.  But I could live with it if I had link pages to it (it 
> would be suboptimal to my needs, but I could certainly live with it).

For once, i agree.  Manual pages for header files (like stddef.h(3))
are an even worse idea than manual pages for preprocessor constants
and data types.  The main reason is that almost every header file
includes so much material that such pages would become unreasonably
large, just as Alejandro explains above.  Besides, the material in
a typical header file is not so closely related among itself that
it should share a manual page.  And in fact, documentation for
almost every header file is already split up into many manual
pages: one for each group of closely related functions.

[...]
>> POSIX supplements to the standard C library, and other libraries for C,
>> will not typically have this difficulty, and will have one definition
>> site only for their constants with external linkage.

> Oh, no.  Check ssize_t(3type)!

Indeed.

Then again, while NULL and size_t are exceptional in so far as they
are so widely used that there is no clear place for documenting them,
most types are better documented together with the most typical
function using them.

For example, document  struct tm  in  localtime(3)  and
                       ssize_t    in  read(2).

Besides, almost nothing needs to be said where ssize_t is documented.
In particular, documentation of %zd belongs with printf(3), not with
ssize_t(3type).

Yours,
  Ingo



Re: [PATCH v3] NULL.3const: Add documentation for NULL

2022-07-25 Thread Ingo Schwarze
Hi Alejandro,

Alejandro Colomar wrote on Sun, Jul 24, 2022 at 09:19:32PM +0200:

> - Move to man3const [Ralph, Branden]
> - Added LIBRARY section
> - Added #include [Ralph]
> - Note that it can also be used as a function pointer [Ralph]
> - Document that 0 is another null pointer constant [Ralph]
>   But note that it's to be avoided by most coding standards [alx]
> - Note that NULL is not NUL
> - Improve wording about zeroing a pointer [Ralph]
>   And refer to getaddrinfo(3) for an example.
>   This probably can be further improved; I'm not convinced.
> - Trim SEE ALSO to just void(3type)
> - Other minor fixes
> 
> v3:
> 
> - Don't boldface 0s, since it doesn't refer to the literal constant 0,
>   but to the bit pattern of 0s.
> - Add list of headers that also define NULL (per POSIX.1-2008).
> 
> 
>  man3const/NULL.3const | 80 +++
>  1 file changed, 80 insertions(+)
>  create mode 100644 man3const/NULL.3const
> 
> diff --git a/man3const/NULL.3const b/man3const/NULL.3const
> new file mode 100644
> index 0..730f670fe
> --- /dev/null
> +++ b/man3const/NULL.3const
> @@ -0,0 +1,80 @@
> +.\" Copyright (c) 2022 by Alejandro Colomar 
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.\"
> +.TH NULL 3const 2022-07-22 Linux "Linux Programmer's Manual"
> +.SH NAME
> +NULL \- null pointer constant
> +.SH LIBRARY
> +Standard C library
> +.RI ( libc )
> +.SH SYNOPSIS
> +.nf
> +.B #include 
> +.PP
> +.B "#define NULL  ((void *) 0)"
> +.fi
> +.SH DESCRIPTION
> +.B NULL
> +represents a null pointer constant.
> +.PP
> +According to POSIX,

That phrase is misplaced in the DESCRIPTION.
It belongs in STANDARDS.

Not littering the DESCRIPTION with misplaced information is particularly
important in the first two sentences, because that's the first point
of contact for the user where they are likely trying to figure out
what the basic idea of the thing is, and whether they are even looking
at the right manual page.

> +it shall expand to an integer constant expression with the value

Using the word "shall" in a manual page is usually terrible style.
Here, it is misleading on top of that because it is unrelated to
anything the user might be expected to do or not do.

A manual page is neither a standard document formally defining
the language nor a guideline for compiler authors.

Considering this sentence in isolation, what you want to say is:

  The macro
  .B NULL
  expands to the integer number 0 cast to the type
  .IR "void *" .

But you are violating an important guideline for writing manual
pages: avoid useless verbosity, don't say the same thing twice.
Here, you are saying exactly the same *three times*:  in the
SYNOPSIS, in the first sentence of the DESCRIPTION, and in the
second sentence of the DESCRIPTION.

Consequently, i suggest deleting the second sentence with no
replacement.

> +.B 0

There is really no need to mark up integer constants.

> +cast to type

Bad grammar:  s/to type/to the type/.

> +.IR "void *" .
> +.PP
> +A null pointer is one that doesn't point to a valid object or function.

That sounds like an afterthought, which is always bad in documentation.
If you think the first sentence is too vague, integrate this information
into the first sentence, where it obviously belongs.
Besides, this wording is misleading: it sounds as if NULL might be
pointing to an invalid object.

I guess what you mean is:

  .B NULL
  represents a null pointer constant, that is, a pointer that
  does not point to anything.

> +.SH CONFORMING TO

That should be ".SH STANDARDS".

> +C99 and later;
> +POSIX.1-2001 and later.
> +.SH NOTES

I thoroughly hate this section.  It is almost always an indication
that the organization of the page wasn't properly thought through
and random afterthoughts were dumped here.

> +It is undefined behavior to dereference a null pointer

That is formally true, but hardly helpful in a manual page
because what happens when you dereference a NULL pointer is
fairly predictable in practice: a segmentation fault.

Any other behaviour of the C language implementation would be such a
massive security risk that i don't think even the most avid compiler
optimizer would seriously consider it.  According to my practical
experience, NULL pointer accesses are among the most frequent bugs,
easily 20-50% of all bugs that can be found by fuzzing real-world code.
Having the C language do anything else than segfault, for example
continue execution with invalid or uninitialized data, would turn
huge numbers of fairly harmless bugs into potentially exploitable
security vulnerabilities.  Using my experience, off the top of my head,
i would estimate that *not* segfaulting on NULL pointer accesses
would, in a typical codebase, increase the number of potentially
exploitable vulnerabilities by roughly one power of ten.

So, if you want to be helpful to the reader, you should say
something like:

  While dereferencing a NULL pointer is formally undefined
  

Re: All caps .TH page title

2022-07-23 Thread Ingo Schwarze
Hi Alejandro,

On 7/22/22 12:35, Alejandro Colomar (man-pages) wrote:

> BTW, I think I didn't reply (or if I did was very short) to your comment 
> that other languages may find it difficult to mirror our use of 
> subsections, since their main section is already a subsection (e.g., 
> 3pl).

Other languages are usually better off to live *outside* the $MANPATH
and tell users to use "man -M" to access their documentation.
For example, on OpenBSD, the TCL manuals live
in /usr/local/lib/tcl/tcl8.5/man/ .
Putting them into /usr/local/man/ would be quite disruptive because
that would cause lots of clashes, including "apply", "break", "cd",
"close", "eval", "exec", "exit", "expr", "glob", "info", "join", "open",
"puts", "pwd", "read", "socket", "time", and so on.  I expect most
other languages will cause similar noise.
Perl is better because the clashing names are mostly part of perlfunc(1),
and the majority of other Perl manual page names contain colons.
FORTRAN (traditionally in man3f) is also better because in this
instance, the cryptic FORTRAN six-letter identifiers become a virtue
in so far as they prevent clashes.

> I'd say that since C is the native Unix language, and others are 
> just... others?, I'd optimize for C, and let other languages find
> a way to document their things.

True - not because C is better or more worthy, but because looking up
an identifier logically requires specifying the language, and as
explained elsewhere, section suffixes are a poor solution for that.

> It would be easy to say just go away, the man pages are for C,

Absolutely not.  While the manual page format may have some features
that are particularly well-suited to documenting C, it is not
limited to that role.

> but I won't dare to say that, since I like man pages, 
> and I'd like to see more documentation for languages that I sometimes 
> have to use be in the form of man pages, so I'll try to come up with a 
> more imaginative answer:  how about using subsubsections of the form 
> 3pl_type?  At least it's a possibility.  man(1) would handle them as any 
> other subsection, but that's not a big problem.  Maybe man(1) could 
> develop a way to provide subsubsections...  Colin, any ideas in this 
> regard?

See above.

Alejandro Colomar wrote on Fri, Jul 22, 2022 at 01:46:37PM +0200:

> Or, maybe it's the time to write a whole new volume?  I think there's a 
> comparable difference between 3type and 3 than between 2 and 3 or 1 and 
> 8, so it would be merited.  I didn't do it before for two reasons: it 
> might break software that assumes than Unix manuals use a single number 
> followed by an optional string (I'd say it's not a fair assumption to 
> say that man9 would be the last one ever used; if there's 9, there might 
> be a 10 some day), and because other projects had already used 3type.
> 
> But, that would start a clean namespace.  Maybe it's worth it.

No, that would absolutely not be clean design.  I advise strongly
against it.

First, concatenating integer numbers and strings is often bad design
because it significantly complicates processing in various ways.
For example, the traps set by the strtol(3) family of functions,
in particular regarding trailing non-digit characters, are
legendary.  Bugs love the breeding ground.  As another example,
numerical vs. alphabetical sorting is a similarly famous trap,
consider the difference of sort(1) vs. "sort -n".  I'm sure
you do *not* want to design a data type represented as a string
such that the first part needs to be sorted numerically and the
second part needs to be sorted alphanumerically - with not even
a delimiting character in between.
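
Both traps can be shown in a few lines of C.  This is only a sketch;
the function names are made up for illustration and are not taken
from any man(1) implementation.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Split a section name like "3type" into its leading number and
 * the remaining suffix.  Returns the number; *suffix points at the
 * first character that strtol(3) did not consume.  If there are no
 * leading digits at all, strtol(3) returns 0 and *suffix == name,
 * so a caller who ignores *suffix cannot tell "type" from "0".
 */
long
split_section(const char *name, const char **suffix)
{
    char *end;
    long  num;

    num = strtol(name, &end, 10);
    *suffix = end;
    return num;
}

/* Compare two section names as plain strings, like sort(1). */
int
cmp_alpha(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* Compare only the leading numbers, like "sort -n". */
int
cmp_numeric(const char *a, const char *b)
{
    const char *sa, *sb;
    long na = split_section(a, &sa);
    long nb = split_section(b, &sb);

    return (na > nb) - (na < nb);
}
```

With these, cmp_alpha("10", "2") is negative while
cmp_numeric("10", "2") is positive, so the two orders disagree as
soon as a section number has two digits.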

Less technically, having a small number of sections with non-descriptive
names is fine; people get used to the meaning of 1 to 9.  But when
you start adding more sections, a scheme with non-descriptive names
sooner or later becomes unsustainable.  "What was section 42 again...
i guess i'll have to look that up."

So the design already strikes me as terrible even before starting
to consider portability.
I would expect no end of compatibility problems.

 * Most man(1) implementations will probably treat section 10
   as a subsection of section 1.
 * While "man -s 10" may work with some man(1) implementations,
   "man 10 wchar_t" will fail saying
 man: No entry for 10 in the manual.
   on most.
 * When sorting, you will probably get 1, 10, 11, 2, 3, ...
 * I expect lots of code does
 char sc = '1' + sn - 1;
 asprintf(, "%s.%c%s", name, sc, suffix);
   which leaves you with "name.:suffix" if sn == 10.
 * ...
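
The character arithmetic problem from the list above can be
reproduced as follows.  This is a sketch under assumptions: the
function names are hypothetical, snprintf(3) is used in place of the
non-standard asprintf(3), and the result for section 10 relies on
':' directly following '9', as it does in ASCII.

```c
#include <stdio.h>

/* Map a section number to one character; only correct for 1 to 9. */
char
section_char(int sn)
{
    return '1' + sn - 1;
}

/*
 * Build "name.<section><suffix>" in buf.
 * Returns the result of snprintf(3).
 */
int
page_filename(char *buf, size_t bufsz, const char *name, int sn,
    const char *suffix)
{
    return snprintf(buf, bufsz, "%s.%c%s", name,
        section_char(sn), suffix);
}
```

Calling page_filename(buf, sizeof(buf), "name", 10, "suffix") on an
ASCII system leaves "name.:suffix" in buf.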

> How would you feel if I inaugurate man10 for types, and later man11 for 
> non-function-like macros? :D

I wouldn't feel well at all.
I think i'd prefer contracting a common cold to having to deal with that.

Yours,
  Ingo



Re: All caps .TH page title

2022-07-23 Thread Ingo Schwarze
Hi Doug,

G. Branden Robinson wrote on Fri, Jul 22, 2022 at 07:48:21AM -0500:
> At 2022-07-22T07:36:03-0400, Douglas McIlroy wrote:

>> Changing the .TH case convention throughout the Unix world is about
>> as futile an effort as English spelling reform.

> I love a challenge.

In addition to that, i'm not convinced it's fair to call an accessibility
effort "futile".

> True, but on the one hand I don't mind, and on the other, as indicated
> by the start of this thread, Alejandro is seriously considering doing so
> for the Linux man-pages project, the corpus of which is large

No guarantee, but i might also consider doing it in OpenBSD base.
That corpus is roughly 85% larger than the Linux man pages project.
Of course, i need to find the time, and it requires that no OpenBSD
developers strongly object, but since it does improve accessibility,
i don't really expect that much opposition.

Yours,
  Ingo


P.S.
The little troll inside me thinks that besides, wouldn't it be
wonderful if English spelling were as consistent and as easy to
learn as French spelling?  (No, i'm not proposing an English
spelling reform: that's not my department.)



Re: All caps .TH page title

2022-07-22 Thread Ingo Schwarze
Hi,

Colin Watson wrote on Fri, Jul 22, 2022 at 01:22:57AM +0100:

> man-db doesn't index on the .TH section at all, and I don't believe
> I've encountered the practice of doing so in other indexers
> (I could be wrong, but I think that's something I would have
> remembered if I'd noticed it).

FWIW, the mandoc(1) implementation of the indexes uses the following
to derive names for a page:

 * the first component of the file name,
   including those of hard, soft, and .so links
 * in man(7) pages, the first argument of the .TH macro    <<=
 * in man(7) pages, any words preceding the first "-" or "\-"
   in the NAME section
 * in mdoc(7) pages, the first argument of the .Dt macro   <<=
 * in mdoc(7) pages, arguments of .Nm macros in the NAME section
 * in mdoc(7) pages, arguments of .Nm macros in the SYNOPSIS
 * in mdoc(7) pages, first arguments of .Fo and .Fn macros in the SYNOPSIS

The last two - mdoc(7) SYNOPSIS content - are only used for man -k
searches explicitly specifying the Nm search key.  All others are also
used for plain man(1) name lookup.

In mandoc the following are used as section names:

 * if the directory name starts with "man" or "cat", the rest of it
 * the file name suffix, starting after the last dot
 * in man(7) pages, the second argument of the .TH macro
 * in mdoc(7) pages, the second argument of the .Dt macro

The above rules often cause pages to end up with more than one name
and more than one section.  For example, a file called

  man3p/strcpy.3

containing

  .TH strlcat 3bsd
  .SH NAME
  wcslcpy, wcslcat \- yadayada

can be found with any of the following commands, and several more:

  man 3bsd strcpy
  man 3 strlcat
  man wcslcat
  man 3p wcslcpy
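
The lookup behaviour for this example can be modelled with a small
C sketch.  It is not mandoc's code or database format: the struct,
the function names, and the assumption that section names are
compared for exact equality are all invented for illustration.

```c
#include <stddef.h>
#include <string.h>

#define MAXN 8

/* Hypothetical index record, much simpler than a real database. */
struct page {
    const char *names[MAXN];     /* NULL-terminated */
    const char *sections[MAXN];  /* NULL-terminated */
};

/* Return 1 if key occurs in the NULL-terminated list. */
static int
contains(const char *const *list, const char *key)
{
    for (; *list != NULL; list++)
        if (strcmp(*list, key) == 0)
            return 1;
    return 0;
}

/* A NULL section means "any section", as in "man wcslcat". */
int
page_matches(const struct page *p, const char *sec, const char *name)
{
    if (sec != NULL && !contains(p->sections, sec))
        return 0;
    return contains(p->names, name);
}

/*
 * The example page: file man3p/strcpy.3 containing
 * ".TH strlcat 3bsd" and NAME "wcslcpy, wcslcat".
 */
const struct page example = {
    { "strcpy", "strlcat", "wcslcpy", "wcslcat", NULL },
    { "3p", "3", "3bsd", NULL }
};
```

Because names and sections are collected independently, any of the
three sections combines with any of the four names, which is why
page_matches(&example, "3bsd", "strcpy") succeeds in this model.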

As a special case, if the .TH or .Dt argument agrees with one among
the other names but differs in case, it is not used, because
mandoc assumes the other name is likely correctly cased while
the name in the .TH or .Dt macro may have been converted to all caps.

> man-db's man(1) performs case-insensitive lookups by default, which
> I've found to be more useful behaviour.  Case-sensitive lookup can be
> requested using the -I/--match-case option.

In the mandoc implementation, plain man(1) follows the tradition of
being case-sensitive, but i must admit examples of manual pages with
names that differ only in case are hard to find indeed, so it might
make sense to change this and make it agree with man-db.

In the mandoc implementation of apropos(1), searches use case-insensitive
extended regular expressions by default (which originally was a
proposal coming from FreeBSD).  If the regular expression operator '~'
is explicitly specified, the search becomes case-sensitive.  If the -i
option is given, it becomes case-insensitive again.  The substring-search
operator '=' always remains case-insensitive no matter what.

> As far as I know this was an innovation when I introduced it in 2002,
> so I don't know how widespread this behaviour is among man(1)
> implementations.  You probably can't rely on it.
> 
> But in any case, as above, changing the arguments to .TH doesn't affect
> this.  I haven't noticed it being widespread practice to spuriously
> capitalize the name part of lines in the "NAME" section.

Indeed, doing that would be very bad style, not least because it would
make the correct capitalization of the name hard to find for the
human reader of the manual page.

Yours,
  Ingo



[bug #40720] [UPGRADE] improve Unicode support

2022-07-14 Thread Ingo Schwarze
Follow-up Comment #4, bug #40720 (project groff):

[comment #3 comment #3:]
> Back in '04, Werner posted an overview of how to start tackling this:
http://lists.gnu.org/r/groff/2004-05/msg00074.html

That's not really an overview but merely a single, partial idea with no
context.
Essentially, Werner wrote nothing but: "Widen the internal input character
support to 32bit."
Besides, that idea is likely controversial.
While i do not deny that it is usually *possible* to change a codebase from
using char * strings internally to using wchar_t * strings internally, use
MB-to-wide string conversion functions on input and use wide-to-MB string
conversion functions on output, this kind of change is about as disruptive as
anything can be in a codebase: the result usually is that you need code
changes in *almost everything* because few code bases have much code that
neither inputs nor processes nor outputs strings.  Certainly not groff.  So
given that most files and most functions in the groff code base will likely
have to be changed, Werner's dismissive statement "It's not very complicated"
feels badly misleading to me.  Werner's cautionary parenthetic remark "(at
least at the beginning)" does not prevent his statement from being misleading:
changing the input character type to wchar_t gets maximally intrusive *right
away*, not at some later point.  You have to uproot the whole codebase
*before* you can even start doing anything productive with those massive
changes.

The idea Werner attributes to Bernd (which should already give us pause: Bernd
is not exactly known for good software design) is even worse: "Create a new
type for input character codes."  That is not even controversial, it is an
obviously terrible idea.  You do not create a new type when a type for exactly
that purpose already exists in the C standard.  I admit that many people do
just that for a variety of reason - sometimes because they are simply
unfamiliar with the C and POSIX standards, sometimes in misguided attempts at
portability, sometimes out of sheer NIH syndrome.  So *if* you want to change
the input character type to a wide character type (which i wouldn't want to),
you should at the very least use a C standard type.

This ticket is throwing around suggestions of massively intrusive changes
proposed almost two decades ago without bothering to say - even in the vaguest
terms - what the problem really is.  Let me claim this has already been mostly
solved during the last two decades, and the very old mail you are quoting is
simply completely outdated.  For example, just today, a user asked on our
mailing list how to use emoji characters in groff, and a working solution was
readily proposed to him.  Yes, i know the Unicode standard contains lots of
advanced features and some of them may be non-trivial to implement with our
current preconv(1)-based scheme.  But even if we linked groff to some massive
icu4c-style library, which would be an even heavier and more intrusive change
than Werner's decade-old and likely outdated suggestion, *complete* Unicode
support in groff would likely still be more work than rewriting groff
completely from scratch.  So throwing around suggestions for massive changes
serves little purpose until we have a conversation about exactly which
features we want, why the proposed massive changes are the most reasonable way
to implement these specific features, and who is willing to develop both the
fundamental rewrite of the code base and the target changes on top of the new
code base.

Finally, let me point out that how groff currently handles wide characters -
support wide characters both on the input and output side while keeping the
code simple by mostly using plain char[] strings internally - is actually one
good way for keeping wide character support simple in some circumstances.  For
details, see my presentation at the 2016 EuroBSDCon:
"Why and how you ought to keep multibyte character support simple"
https://www.openbsd.org/papers/eurobsdcon2016-utf8.pdf
https://www.openbsd.org/papers/eurobsdcon2016-utf8.roff
The parts most relevant in this context are pages 1-5 and 19-39.
I'm not saying groff needs the *specific* techniques presented here - groff
almost certainly is more complicated than the programs discussed in my talk -
but rather that the existing preconv(1) approach and its simplicity and
modularity has striking similarities to what is discussed here, and likely is
a good approach, whereas the full mbstowcs(3)->wchar_t->wcstombs(3) dance is
much harder than most people think and causes lots of little-known trouble in
practice.
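
To make the idea of a preconv(1)-style approach more concrete, here
is a minimal sketch in C.  It is neither groff's preconv source nor
code from the talk; the function name is hypothetical.  It decodes
UTF-8 on input and replaces every non-ASCII character with an escape
of the form \[uXXXX], so that everything downstream can keep using
plain char[] strings.  It replaces malformed bytes with '?' and, for
brevity, does not reject overlong encodings or surrogates.

```c
#include <stdio.h>

/*
 * Convert UTF-8 input to ASCII with \[uXXXX] escapes.
 * Returns the number of bytes written, not counting the
 * terminating NUL, or -1 if the output buffer is too small.
 */
int
utf8_to_escapes(const char *in, char *out, size_t outsz)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t used = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    while (*p != '\0') {
        unsigned long cp;
        int extra, n;

        /* The lead byte tells how many continuation bytes follow. */
        if (*p < 0x80) {
            cp = *p; extra = 0;
        } else if ((*p & 0xe0) == 0xc0) {
            cp = *p & 0x1f; extra = 1;
        } else if ((*p & 0xf0) == 0xe0) {
            cp = *p & 0x0f; extra = 2;
        } else if ((*p & 0xf8) == 0xf0) {
            cp = *p & 0x07; extra = 3;
        } else {
            cp = '?'; extra = 0;    /* invalid lead byte */
        }
        p++;

        /* Each continuation byte contributes six more bits. */
        while (extra > 0 && (*p & 0xc0) == 0x80) {
            cp = (cp << 6) | (*p & 0x3f);
            p++;
            extra--;
        }
        if (extra > 0)
            cp = '?';               /* truncated sequence */

        if (cp < 0x80)
            n = snprintf(out + used, outsz - used, "%c", (int)cp);
        else
            n = snprintf(out + used, outsz - used, "\\[u%04lX]", cp);
        if (n < 0 || (size_t)n >= outsz - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

For example, the bytes 0xc3 0xa4 (a-umlaut) become \[u00E4], and the
four-byte sequence 0xf0 0x9f 0x98 0x80 becomes \[u1F600].  Only this
one function needs to know about multibyte input.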



___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




Re: Learning groff eqn (was: Typesetting Mathematics by Kernighan and Cherry, retypeset)

2022-07-02 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Sat, Jul 02, 2022 at 11:57:21AM -0500:

> I was thinking this morning that at least groff's lex.cpp for eqn could
> be translated into EBNF for a groff_eqn(7) page; that way the extensions
> would be documented too.

In fact, i'm not a fan of putting BNF into user-facing documentation.
It is good for a language definition in a formal standard because it
is relatively precise for describing syntax, compared to less formal
ways of describing syntax.

But for user documentation, the downside of separating the description
of syntax and semantics, and the downside of BNF being less readable
than a less formal syntax description, usually outweigh the benefit of
higher precision.

To show an extreme example, look at:

  https://man.openbsd.org/pf.conf.5#GRAMMAR

Reading that, do you understand what is going on?  At least in that
case, the manual gives the BNF *in addition to* and *after*
describing the syntax and semantics in a more accessible manner.

The reason i so far refrained from changing the structure
of the https://man.openbsd.org/eqn.7 page is that the BNF in there
is reasonably short and simple.  Nonetheless, it's a chunk of rock
that may sit heavily on some people's stomachs, and it is hardly a
model to be followed.

[...]
> So what I would like to see is an _original_ document introducing
> the novice to GNU eqn.
[... reordered ...]
> Before we have such a thing in
> tutorial form, we should probably have a comprehensive eqn language
> _reference_, and a groff_eqn(7) page could well be the vehicle for that.

I certainly don't object to that approach.
What you say sounds reasonable to me.

By the way, it looks like Ted Harding did something like that:

  https://lists.gnu.org/archive/html/groff/2013-10/pdfTyBN2VWR1c.pdf

[...]
> But it's still a good document.  One of my aims has been to play a
> kind of museum (or art) gallery curator: clean it up, increase its
> accessibility, and help the people (including myself) see it and learn
> from it.  For that purpose to be truly fulfilled, it will need to be
> hosted somewhere; we'll see if anyone does so.  :)

Indeed, a mailing list is about the opposite of a museum: Prone to
disappear at some point, completely unstructured and unsorted, and hard
to find - except with web search engines, but those inevitably have a
terrible signal-to-noise ratio.

With respect to hosting, adding it to https://troff.org/papers.html
would seem perfect to me.

Yours,
  Ingo



Re: Typesetting Mathematics by Kernighan and Cherry, retypeset

2022-07-02 Thread Ingo Schwarze
Hi Branden,

G. Branden Robinson wrote on Fri, Jul 01, 2022 at 12:58:35AM -0500:

> Joerg van den Hoff's recent question on the bug-groff list motivated me
> to look again at eqn.  groff's eqn(1) man page is not useful to learn
> the program, even Schwarze-style--it is avowedly incomplete, mostly
> documenting only differences from the AT&T implementation.

https://man.openbsd.org/eqn.7 is useful in that sense, though.
It isn't Schwarze-style either but Kristaps-style instead.
The main advantage of that document is probably conciseness.
Then again, the original Kernighan/Cherry user guides have the
same advantage of not being long, and in addition to that,
they do make learning easier.

> Also befuddled by the fact that our man page talks about "macros" (for
> eqn, not *roff), but the AT&T documentation pointedly doesn't, I decided
> to go back to the widely praised User's Guide by Kernighan & Cherry with
> as few assumptions as possible and see what I could learn.

The mandoc eqn(7) page linked above avoids the term "macro", too,
precisely because it invites confusion with roff(7) macros.

> I furthermore wanted to retypeset this document with groff since its
> source is available and the V7 Unix Programmer's Manual Volume 2 scans
> on the Web are caked with flyspecks and other unpleasantness.

I understand UNIX v7 is under this BSD-style license by Caldera Inc.:

  https://www.tuhs.org/Archive/Caldera-license.pdf

Does that include the directory usr/doc/eqn/?  It seems likely because
the license contains the words "source code and documentation",
and a "User Guide" very probably qualifies as documentation.
Then again, the word "documentation" is only mentioned in the
context of requirements for redistribution.  In the context of
granting rights, the license only says:

  The following copyright notice applies to the source code files
  for which this license is granted.

So if you read it in a very picky way, you might suspect that no
license whatsoever is granted for documentation files, and the
subsequent requirement regarding documentation has no effect
because these files aren't licensed in the first place.
But i think Caldera probably *intended* to also license the
documentation and simply failed to make that unambiguous -
surprising as that may seem given that the "Director, Licensing
Services" signed the letter, and i would expect such a person to
know what they are doing.  What do you think?

If we feel unsure about the licensing status of this document, we
could also ask Brian Kernighan directly.  It seems possible to me
(though not certain) that maybe he never re-assigned Copyright of
this document to AT&T; remember that originally, it was an article
in the "Communications of the ACM", so it may or may not have been
subject to his working contract with Bell Labs, and i don't know
what that contract said.

*If* this document is indeed freely licensed, would it make sense
to include it in the groff distribution?  It could serve three
useful roles: (1) supplementary, high quality tutorial-style
documentation, (2) providing information about portability,
and (3) a classical example for the use of groff_ms(7).

[...]
> I made some very small changes to the source material, but none to
> anything one might consider narrative; they are all commented and
> explained.  The most important avoids lying about when the document was
> rendered.  I used groff extensions unapologetically (but there wasn't
> much to do).

Putting the document into the GNU roff tree would also provide
the benefit of putting these changes under version control (of
course, they should be kept minimal).

Yours,
  Ingo



mdoc(7) prologue regressions

2022-06-26 Thread Ingo Schwarze
Hi,

after getting the build system issues mostly out of the way,
i proceeded to run-time testing of groff-current.

The first issue i identified is a group of regressions in the
behaviour of the mdoc(7) prologue macros .Dt and .Os.
The regressions aren't particularly severe because all that i found
so far only trigger when the document uses these macros incorrectly.
All the same, i'd like to report them such that we can decide
whether we want to fix some or all of them.

I suspect that this commit might be responsible but admit
that i did not prove this suspicion by testing right before
and right after the commit.  I only tested that the behaviour
changed as described below from groff-1.22.4 to groff-current:

  commit a1e6c19176d38823d8dc6c9a619a493ca90bdca4
  Author: G. Branden Robinson 
  Date:   Sun Oct 3 23:15:12 2021 +1100

  [andoc,man,mdoc]: Fix Savannah #61266.

  Resolve problems in batch rendering of man pages to PDF arising from
  entanglement of end-of-input traps, page location traps, continuous
  rendering mode, and andoc's reloading of the (m)an and (m)doc packages.
  [...]


 1. When there are two .Dt macros in the prologue, the last one used
to win, setting the page title, section number, and section title.
Now, the first one wins, setting these fields.

 2. When a .Dt macro occurs in the body of the page (as opposed to
in the prologue), it used to be ignored.  Now, it causes a
large number of blank lines in the output.

Both issue 1 and issue 2 can be seen with this test file:
https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/regress/usr.bin/mandoc/mdoc/Dt/dupe.in?rev=1.4=text/plain

 3. When the first .Dt macro comes late, the page title used to be
set to "UNTITLED".  Now, it is set to the empty string.

Both issue 2 and issue 3 can be seen with this test file:
https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/regress/usr.bin/mandoc/mdoc/Dt/late.in?rev=1.2=text/plain

 4. If there is no .Dt macro at all, the page title used to be
set to "UNTITLED".  Now, it is set to the empty string, see:

https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/regress/usr.bin/mandoc/mdoc/Dt/missing.in?rev=1.2=text/plain

 5. When the usual order of .Dt and .Os is exchanged,
the .Dt macro is now completely ignored, setting the page title
to the empty string and the section title to "LOCAL", see

https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/regress/usr.bin/mandoc/mdoc/Dt/order.in?rev=1.2=text/plain

 6. The same regression as for issue 5 occurs when there are two .Os
macros in the order .Dd .Os .Dt .Os, see

https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/regress/usr.bin/mandoc/mdoc/Os/dupe.in?rev=1.3=text/plain

 7. When the .Os macro comes late - i.e. in the body of the page
rather than at the usual place in the prologue -
the header line now appears at that place in the middle of the
body and no longer at the top of the manual page where it belongs, see

https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/regress/usr.bin/mandoc/mdoc/Os/late.in?rev=1.2=text/plain

 8. When the .Os macro is completely missing, the header line is no
longer printed at all, see

https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/regress/usr.bin/mandoc/mdoc/Os/missing.in?rev=1.2=text/plain

The most severe issue is probably number 8 because forgetting the .Os
macro, or thinking it might be optional, might even happen in real-world
manual pages.  The next most severe would then be issue 5 because
mixing up the order might also happen in practice.

Number 7 is also somewhat unfortunate.  While not quite as likely to
happen as putting the .Os macro at the wrong place *inside* the prologue,
the effect produced is very ugly.  Similarly, issue 2 is unlikely
to occur in practice, but the effect is also very ugly.

The remaining issues 1, 3, 4, and 6 are less severe.  But in case we
decide that some of the more severe regressions need fixing, maybe
properly fixing them all might not cause that much extra work?
In any case, i thought listing them all might potentially be useful.
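
For anyone who wants to look at issue 8 locally, a minimal sketch
follows.  The page content is made up for illustration and is not the
OpenBSD regression test file linked above; write_missing_os is a
hypothetical helper name.  The sketch only renders the page with
whichever formatters happen to be installed, so the header lines can
be compared by eye.

```shell
#!/bin/sh
# Sketch only: the page below is invented for illustration and is
# not the OpenBSD regression test file.

# Write a minimal mdoc(7) page whose prologue lacks the .Os macro.
write_missing_os() {
  cat > "$1" <<'EOF'
.Dd June 26, 2022
.Dt EXAMPLE 1
.Sh NAME
.Nm example
.Nd page without an Os macro
.Sh DESCRIPTION
Some text.
EOF
}

page=$(mktemp) || exit 1
write_missing_os "$page"

# Render with each formatter that is installed; show the first lines,
# which is where the header line is expected.
for cmd in "groff -mdoc -Tascii" "mandoc -Tascii"; do
  if command -v "${cmd%% *}" >/dev/null 2>&1; then
    printf '=== %s ===\n' "$cmd"
    $cmd "$page" | head -n 5
  fi
done

rm -f "$page"
```

Adding a line like ".Os" after the .Dt line gives the well-formed
variant for comparison.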

Yours,
  Ingo



Re: man(7) .TH font change, was: groff man(7) `B` macro...

2022-06-26 Thread Ingo Schwarze
Steffen Nurpmeso wrote on Mon, Jun 20, 2022 at 02:23:58PM +0200:

> Just to mention that since 2014 my .Mx mdoc(7) extension is
> distributed for the things i use, and i never have heard about an

Just in case people aren't aware, even though this was repeatedly
discussed in the past:  mdocmx is a textbook example of overengineering:
horrendous complexity for almost no gain.  Most of what it aims for can be
done without it and does not require any new syntax; the few aspects it
implements that cannot be done without new syntax are next to irrelevant
for practical purposes.  It is good that nobody except Steffen uses it.



Re: build system: devpdf/download regression

2022-06-26 Thread Ingo Schwarze
Hi Deri,

thanks for your extensive explanation.

Deri wrote on Thu, Jun 23, 2022 at 05:32:10PM +0100:
> On Thursday, 23 June 2022 14:05:29 BST Ingo Schwarze wrote:

>>  1. Define a configuration variable.
>>  2. Unconditionally set that variable to a sane default value,
>>  3. Run some autoconfiguration tests that change configuration
>>  4. Use the configuration variable from source files during the

> You are positively correct.
> You can tell I have never looked after a big project!

I believe the above configuration scheme is also useful for
small projects (like groff or mandoc) and not only for large
projects (like texlive or mozilla or gnome).
Myself, i resent even looking *at* large projects,
let alone looking *after* them...  ;-)

> In the same way that the choice of URW directory is passed into 
> Foundry.in, although I would prefer separate variables, one for
> the users choice and one for the static paths.

I haven't made up my mind whether that is a good idea; it may
or may not be.

In general, i would recommend differentiating configuration variables
by functionality.  For example, if all you need is one font path,
i would expect having one variable to configure it.  Two variables
would make sense to me if you have two different purposes or tasks,
for example, on the one hand a font path that gs(1) and groff *read*
already installed fonts from, and on the other hand a directory
that groff *writes* its own font-related files to.

But it would seem unusual to me to differentiate a configuration
variable according to the *source* of the information.  For
example, if components of the font path can come from three sources,
 1. a static list of directories that you collect over the years
by talking to people using different operating systems;
 2. autoconfiguration results, for example from `gs -h`; and
 3. user configuration, for example from a ./configure option
then i would still expect one single variable since the whole
point of autoconfiguration is overriding or supplementing static
defaults and the whole point of user configuration is overriding or
supplementing both.

What would be the point of splitting the variable according
to the source of information?  (There may well be a point
that i'm still missing.)

[...]
> If I get the chance, as I've got a version of ghostscript with separate
> font  files, I'll do an strace on file openings to see if it actually
> loads the files if it has not been told to embed all fonts, my hunch is
> that it does not need to access the font unless it needs to embed a font
> which is not supplied by the postscript or pdf file it is distilling.

While this may be an interesting question to you, i don't think
it affects how things should be installed.  No doubt embedding a font
in the output files that isn't already embedded in the input files is
functionality that gs(1) needs to provide.  So the needed fonts
need to be
 * embedded via %rom% (as currently done on OpenBSD) or
 * installed as separate files as part of the main ghostscript package or
 * installed as separate files in a separate package (also provided
   by OpenBSD) that the main ghostscript package requires as a
   dependency (which OpenBSD does not currently require, but that's
   OK for now because %rom% is being used)
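
Whether a given gs(1) build mentions %rom% can be checked with a few
lines of shell.  This is only a sketch: uses_rom is a hypothetical
helper name, and the check looks for nothing more than the literal
string %rom% in the output of `gs -h`, which is the indicator
discussed in this thread.

```shell
#!/bin/sh
# Sketch only: uses_rom is a hypothetical helper name.

# Read 'gs -h' output on standard input; succeed if it mentions %rom%.
uses_rom() {
  grep -q '%rom%'
}

# Only run the check when a gs executable is actually installed.
if command -v gs >/dev/null 2>&1; then
  if gs -h | uses_rom; then
    echo "gs -h mentions %rom%"
  else
    echo "gs -h does not mention %rom%"
  fi
fi
```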

> If the results of gs -h include the text %rom%

They do on OpenBSD.

> I believe this means some files have been baked in, but you should
> talk to the ghostscript packager. One wrinkle, if you go with just
> one copy of the fonts and use symlinks is that ghostscript uses
> different names for the fonts than the old ms-dos names.

Since there are quite a few aspects to consider (including
different kinds of fonts even when considering default and URW
fonts only, and the naming business), i think i will focus on
polishing the groff port for now, to help make the groff
release as good as possible, and likely return to the question
whether and how the ghostscript package can be polished once
the groff release and the update of the groff port are done.

That seems best because i do not doubt that our ghostscript
ports in their current form provide everything groff needs,
so staying focussed is possible.

Yours,
  Ingo



Re: build system: devpdf/download regression

2022-06-23 Thread Ingo Schwarze
Hi Deri and Branden,

Deri wrote on Thu, Jun 23, 2022 at 12:56:16AM +0100:
> On Wednesday, 22 June 2022 22:29:29 BST G. Branden Robinson wrote:
>> At 2022-06-22T21:30:25+0100, Deri wrote:

[...]
>>> I think the tests for awk and ghostscript need to just apply to the
>>> line which uses those programs, not the whole section, but the patch
>>> is good.

>> I'm happy to change GROFF_URW_FONTS_CHECK in this manner, but I'll hold
>> off until you or Ingo pushes his patch.

Done.  So, Branden, go wild!

I deliberately didn't add a ChangeLog entry because you two are about
to change this same code in a more fundamental way, and when you commit
the future patch that will ensure the font paths automatically stay
in sync, any ChangeLog entry i might have added would become obsolete
right away.

[...]
> I would suggest I add a new parameter to Foundry.in:-
> 
> static_paths|/usr/share/fonts/type1/gsfonts /usr/share/fonts/default/Type1
> /usr/share/fonts/default/Type1/adobestd35 /usr/share/fonts/type1/urw-base35
> /opt/local/share/fonts/urw-fonts /usr/local/share/fonts/ghostscript
> 
> Then the groff.m4 check can pull the paths from the file.

I don't feel strongly about this, i just wonder.  Isn't the usual
way of autoconfiguration as follows:

 1. Define a configuration variable.
 2. Unconditionally set that variable to a sane default value,
in this case, the default font path currently duplicated
in groff.m4 and Foundry.in.
 3. Run some autoconfiguration tests that change configuration
variables as appropriate for the system we are running on.
In this case, they would likely do something like
adding directories to the font path configuration variable
defined in steps 1 and 2.
 4. Use the configuration variable from source files during the
build, in this case probably from the Foundry and maybe
from others, too.
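
As a rough illustration of the four steps above, here is a minimal
shell sketch.  The names font_path, add_dirs, and detected are made up
for this example and do not appear in groff.m4 or Foundry.in; the
value of "detected" merely stands in for whatever a real test would
extract from `gs -h`.

```shell
#!/bin/sh
# Sketch only: font_path, add_dirs, and detected are hypothetical names.

# Steps 1 and 2: one variable, unconditionally set to a default.
font_path="/usr/share/fonts/type1/gsfonts /opt/local/share/fonts/urw-fonts"

# Helper: append directories to font_path, skipping duplicates.
add_dirs() {
  for dir in "$@"; do
    case " $font_path " in
      *" $dir "*) ;;                         # already present
      *) font_path="$font_path $dir" ;;      # supplement the default
    esac
  done
}

# Step 3: an autoconfiguration test supplements the default.
# A real test would parse 'gs -h'; this sample stands in for its result.
detected="/usr/local/share/fonts/ghostscript /usr/share/fonts/type1/gsfonts"
add_dirs $detected

# Step 4: later build steps only ever read the one variable.
for dir in $font_path; do
  printf 'would search %s\n' "$dir"
done
```

The point of the design is that consumers in step 4 never need to know
which step contributed a given directory.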

Then again, if you see strong technical reasons to do things
backwards in this particular case, that is, have the autoconfiguration
system read defaults from source files that are intended for
being processed much later (in this case, the Foundry), so be it.

>>> Since the fonts don't appear in any of the directories yielded by "gs
>>> -h" means that the person porting ghostscript for your system decided
>>> to use the option to have the fonts embedded in the gs executable
>>> (%rom%) rather than as separate font files. I'm not sure if there is
>>> much advantage with modern hardware. Here's a chap asking where the
>>> font files have gone:-
>>> 
>>> https://stackoverflow.com/questions/38331893/ghostscript-fonts-folder-removed-from-later-versions

Let me paraphrase to see whether i understood this correctly.

The gs(1) program generally needs access to a specific set of fonts.
Consequently, for the main ghostscript package to be useful, these
fonts need to be included in the main ghostscript package (as opposed
to being provided by a separate fonts package).

There are two alternative ways how these fonts can be included in
the main ghostscript package: Either as separate font files; that's
the old way.  Or embedded into the gs(1) executable itself; that's
what upstream (the ghostscript project) now recommends.  But you
say benefits from embedding these fonts are likely negligible.

In OpenBSD, we have these fonts *both* embedded in the main gs(1)
executable (such that they can be used even if our ghostscript-fonts
package is not installed) *and* packaged separately in the
ghostscript-fonts package (such that groff ./configure can detect
them and the groff PDF Foundry can generate font descriptions for
them even when the main ghostscript package is not installed).

But that means if somebody wants to do both a full groff build
and run gropdf(1) on the same OpenBSD machine, they actually need
to install *two* copies of these fonts: the ghostscript-fonts
package for ./configure and Foundry and the main ghostscript
package for run-time usage by gropdf(1).

Do you think it would be an improvement to remove the fonts from
the gs(1) executable (and consequently, from the main ghostscript
package) and instead have the main ghostscript package depend on
the ghostscript-fonts package?

>> I wonder if we should detect this and warn about it.  I've gotten
>> pretty familiar with our Autoconf macros.

> I don't think so. If the URW fonts are not found then just suggesting to
> install the URW fonts is sufficient. They can't do much if their packaged 
> ghostscript has the fonts baked into the executable, but they can install a 
> URW fonts package or tar-ball.

Well, if i understand correctly, we *can* do something about that.
If the fonts are "baked into the executable" in the main ghostscript
package, *only* gs(1) can use them, but no other program (for example,
groff ./configure and Foundry).  We could change our main ghostscript
package to depend properly on installed fonts instead of "baking in",
such that other programs can use these fonts, too, couldn't we?
That would seem

[groff] 01/01: sync the font path from the PDF Foundry to ./configure

2022-06-23 Thread Ingo Schwarze
schwarze pushed a commit to branch master
in repository groff.

commit 0f6c779cf4f18457a60cb06b774009670ab90c4b
Author: Ingo Schwarze 
AuthorDate: Thu Jun 23 13:57:55 2022 +0200

sync the font path from the PDF Foundry to ./configure

OK deri@ gbranden@
---
 m4/groff.m4 | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/m4/groff.m4 b/m4/groff.m4
index 8740ce10..6f43f956 100644
--- a/m4/groff.m4
+++ b/m4/groff.m4
@@ -288,10 +288,8 @@ AC_DEFUN([GROFF_URW_FONTS_PATH], [
 ])
 
 # Check availability of URW fonts in the search path given by 'gs -h'
-# supplemented with
-# /usr/share/fonts/type1/gsfonts/:/opt/local/share/fonts/urw-fonts
-# (where font/devpdf/Foundry.in expects them), or in the custom
-# directory passed to 'configure'.
+# supplemented with the paths where font/devpdf/Foundry.in expects them,
+# or in the custom directory passed to 'configure'.
 
 AC_DEFUN([GROFF_URW_FONTS_CHECK], [
   AC_REQUIRE([GROFF_AWK_PATH])
@@ -301,8 +299,13 @@ AC_DEFUN([GROFF_URW_FONTS_CHECK], [
   then
 AC_MSG_CHECKING([for URW fonts in Type 1/PFB format])
 _list_paths=`$GHOSTSCRIPT -h | $AWK 'BEGIN { found = 0 } /Search path:/ { found = 1 } /^[ ]*\// { print $'0' }'| tr : ' '`
-_list_paths="$_list_paths /usr/share/fonts/type1/gsfonts/ \
-  /opt/local/share/fonts/urw-fonts/"
+_list_paths="$_list_paths \
+  /usr/share/fonts/type1/gsfonts/ \
+  /usr/share/fonts/default/Type1/ \
+  /usr/share/fonts/default/Type1/adobestd35/ \
+  /usr/share/fonts/type1/urw-base35/ \
+  /opt/local/share/fonts/urw-fonts/ \
+  /usr/local/share/fonts/ghostscript/"
 if test -n "$urwfontsdir"
 then
   _list_paths="$ _list_paths $urwfontsdir"
