Hi Klemens,

Ingo Schwarze wrote on Tue, Feb 18, 2020 at 04:30:53PM +0100:

> While i don't strongly object to the patch, it might be worth holding
> off a bit on manually tagging of .Sh given that even automatic
> tagging isn't done for that macro yet.  Which would mean postponing
> this patch until automatic tagging for .Sh has been decided and
> implemented.
> 
> Given that this is essentially manual tagging of .Sh, if you decide
> to put this in for the time being until automatic tagging of .Sh
> may arrive, i agree with the two developers who said that
> 
>   .Tg asn1parse
>   .Sh ASN1PARSE
> 
> would feel more natural,

So, you did put this in, which is of course fine with me.

> even though for now, that puts the tag two lines early in
> terminal output.  But at the latest when automatic tagging for
> .Sh gets implemented, i will certainly fix that offset.

I just fixed that two-line offset, mainly because it came almost
for free as a by-product of starting to systematically improve HTML
output, in particular starting to get rid of <mark> elements where
they are not really needed.

I'm appending the committed patch below so that people who want to
can easily review what the HTML output now looks like for explicitly
tagged section headers (in the regress/mdoc/Sh/tag.out_html file
contained in the patch).

I also installed the new code on man.openbsd.org:

  https://man.openbsd.org/openssl#s_client
  https://man.openbsd.org/route#add

The following works unchanged because we already had automatic
tagging for section and subsection headers in HTML output:

  https://man.openbsd.org/openssl#COMMON_NOTATION

In terminal output, i did not implement automatic tagging for section
and subsection headers yet.  But given the current state of NODE_ID
support, that's now trivial to implement if people would like to see
it.  The benefit isn't huge, but it might add a bit of consistency.
For example, right now, in ksh(1), these work:

  /^   Quoting
  /^FILES

Adding automatic tagging for .Sh and .Ss would make these two
jump to the same places:

  :tQuoting
  :tFILES

Yours,
  Ingo


Log Message:
-----------
Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes.  Use this for explicitly tagged
section headers.  Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.

Modified Files:
--------------
    mandoc:
        mdoc_term.c
        mdoc_validate.c
        roff.c
        roff.h
        tree.c
    mandoc/regress/mdoc/Sh:
        Makefile

Added Files:
-----------
    mandoc/regress/mdoc/Sh:
        tag.in
        tag.out_ascii
        tag.out_html
        tag.out_markdown

Revision Data
-------------
Index: roff.h
===================================================================
RCS file: /home/cvs/mandoc/mandoc/roff.h,v
retrieving revision 1.71
retrieving revision 1.72
diff -Lroff.h -Lroff.h -u -p -r1.71 -r1.72
--- roff.h
+++ roff.h
@@ -522,6 +522,8 @@ struct      roff_node {
 #define        NODE_NOFILL      (1 << 8)  /* Fill mode switched off. */
 #define        NODE_NOSRC       (1 << 9)  /* Generated node, not in input file. */
 #define        NODE_NOPRT       (1 << 10) /* Shall not print anything. */
+#define        NODE_ID          (1 << 11) /* Target for deep linking. */
+#define        NODE_HREF        (1 << 12) /* Link to another place in this page. */
        int               prev_font; /* Before entering this node. */
        int               aux;     /* Decoded node data, type-dependent. */
        enum roff_tok     tok;     /* Request or macro ID. */
Index: tree.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/tree.c,v
retrieving revision 1.85
retrieving revision 1.86
diff -Ltree.c -Ltree.c -u -p -r1.85 -r1.86
--- tree.c
+++ tree.c
@@ -199,6 +199,13 @@ print_mdoc(const struct roff_node *n, in
                        putchar(')');
                if (n->flags & NODE_EOS)
                        putchar('.');
+               if (n->flags & NODE_ID) {
+                       printf(" ID");
+                       if (n->string != NULL)
+                               printf("=%s", n->string);
+               }
+               if (n->flags & NODE_HREF)
+                       printf(" HREF");
                if (n->flags & NODE_BROKEN)
                        printf(" BROKEN");
                if (n->flags & NODE_NOFILL)
Index: mdoc_validate.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mdoc_validate.c,v
retrieving revision 1.378
retrieving revision 1.379
diff -Lmdoc_validate.c -Lmdoc_validate.c -u -p -r1.378 -r1.379
--- mdoc_validate.c
+++ mdoc_validate.c
@@ -1094,21 +1094,32 @@ post_st(POST_ARGS)
 static void
 post_tg(POST_ARGS)
 {
-       struct roff_node        *n, *nch;
+       struct roff_node        *n, *nch, *nn;
        size_t                  len;
 
+       /* Find the next node. */
        n = mdoc->last;
+       for (nn = n; nn != NULL; nn = nn->parent) {
+               if (nn->next != NULL) {
+                       nn = nn->next;
+                       break;
+               }
+       }
+
+       /* Add the default argument, if needed. */
        nch = n->child;
-       if (nch == NULL && n->next != NULL &&
-           n->next->child->type == ROFFT_TEXT) {
+       if (nch == NULL && nn != NULL && nn->child->type == ROFFT_TEXT) {
                mdoc->next = ROFF_NEXT_CHILD;
                roff_word_alloc(mdoc, n->line, n->pos, n->next->child->string);
                nch = mdoc->last;
                nch->flags |= NODE_NOSRC;
                mdoc->last = n;
        }
-       if (nch == NULL || *nch->string == '\0') {
+
+       /* Validate the first argument. */
+       if (nch == NULL || *nch->string == '\0')
                mandoc_msg(MANDOCERR_MACRO_EMPTY, n->line, n->pos, "Tg");
+       if (nch == NULL) {
                roff_node_delete(mdoc, n);
                return;
        }
@@ -1116,14 +1127,42 @@ post_tg(POST_ARGS)
        if (nch->string[len] != '\0')
                mandoc_msg(MANDOCERR_TG_SPC, nch->line, nch->pos + len + 1,
                    "Tg %s", nch->string);
+
+       /* Keep only the first argument. */
        if (nch->next != NULL) {
                mandoc_msg(MANDOCERR_ARG_EXCESS, nch->next->line,
                    nch->next->pos, "Tg ... %s", nch->next->string);
                while (nch->next != NULL)
                        roff_node_delete(mdoc, nch->next);
        }
-       if (nch->string[len] != '\0')
+
+       /* Drop the macro if the first argument is invalid. */
+       if (len == 0 || nch->string[len] != '\0') {
                roff_node_delete(mdoc, n);
+               return;
+       }
+
+       /* By default, write a <mark> element. */
+       n->flags |= NODE_ID;
+       if (nn == NULL)
+               return;
+
+       /* Explicit tagging of specific macros. */
+       switch (nn->tok) {
+       case MDOC_Sh:
+       case MDOC_Ss:
+               if (nn->head->flags & NODE_ID || nn->head->child == NULL)
+                       break;
+               n->flags |= NODE_NOPRT;
+               nn->head->flags |= NODE_ID | NODE_HREF;
+               assert(nn->head->string == NULL);
+               nn->head->string = mandoc_strdup(nch->string);
+               break;
+       default:
+               break;
+       }
+       if (n->flags & NODE_NOPRT)
+               n->flags &= ~NODE_ID;
 }
 
 static void
Index: roff.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/roff.c,v
retrieving revision 1.370
retrieving revision 1.371
diff -Lroff.c -Lroff.c -u -p -r1.370 -r1.371
--- roff.c
+++ roff.c
@@ -1173,7 +1173,7 @@ deroff(char **dest, const struct roff_no
        char    *cp;
        size_t   sz;
 
-       if (n->type != ROFFT_TEXT) {
+       if (n->string == NULL) {
                for (n = n->child; n != NULL; n = n->next)
                        deroff(dest, n);
                return;
Index: mdoc_term.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mdoc_term.c,v
retrieving revision 1.377
retrieving revision 1.378
diff -Lmdoc_term.c -Lmdoc_term.c -u -p -r1.377 -r1.378
--- mdoc_term.c
+++ mdoc_term.c
@@ -117,7 +117,6 @@ static      int       termp_pp_pre(DECL_ARGS);
 static int       termp_ss_pre(DECL_ARGS);
 static int       termp_sy_pre(DECL_ARGS);
 static int       termp_tag_pre(DECL_ARGS);
-static int       termp_tg_pre(DECL_ARGS);
 static int       termp_under_pre(DECL_ARGS);
 static int       termp_vt_pre(DECL_ARGS);
 static int       termp_xr_pre(DECL_ARGS);
@@ -244,7 +243,7 @@ static const struct mdoc_term_act mdoc_t
        { NULL, termp____post }, /* %Q */
        { NULL, termp____post }, /* %U */
        { NULL, NULL }, /* Ta */
-       { termp_tg_pre, NULL }, /* Tg */
+       { termp_skip_pre, NULL }, /* Tg */
 };
 
 static int      fn_prio = TAG_STRONG;
@@ -341,6 +340,10 @@ print_mdoc_node(DECL_ARGS)
        memset(&npair, 0, sizeof(struct termpair));
        npair.ppair = pair;
 
+       if (n->flags & NODE_ID)
+               tag_put(n->string == NULL ? n->child->string : n->string,
+                   TAG_MANUAL, p->line);
+
        /*
         * Keeps only work until the end of a line.  If a keep was
         * invoked in a prior line, revert it to PREKEEP.
@@ -2063,13 +2066,6 @@ termp_tag_pre(DECL_ARGS)
              n->parent->parent->parent->tok == MDOC_It)))
                tag_put(n->child->string, TAG_STRONG, p->line);
        return 1;
-}
-
-static int
-termp_tg_pre(DECL_ARGS)
-{
-       tag_put(n->child->string, TAG_MANUAL, p->line);
-       return 0;
 }
 
 static int
Index: Makefile
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/mdoc/Sh/Makefile,v
retrieving revision 1.6
retrieving revision 1.7
diff -Lregress/mdoc/Sh/Makefile -Lregress/mdoc/Sh/Makefile -u -p -r1.6 -r1.7
--- regress/mdoc/Sh/Makefile
+++ regress/mdoc/Sh/Makefile
@@ -1,11 +1,11 @@
-# $OpenBSD: Makefile,v 1.12 2020/02/27 01:25:58 schwarze Exp $
+# $OpenBSD: Makefile,v 1.13 2020/02/27 21:38:27 schwarze Exp $
 
 REGRESS_TARGETS         = badNAME before empty emptyNAME first nohead order
 REGRESS_TARGETS        += orderNAME paragraph parbefore parborder punctNAME
-REGRESS_TARGETS        += subbefore transp
+REGRESS_TARGETS        += subbefore tag transp
 LINT_TARGETS    = badNAME before empty emptyNAME first nohead order
 LINT_TARGETS   += orderNAME parbefore parborder punctNAME subbefore
-HTML_TARGETS    = paragraph
+HTML_TARGETS    = paragraph tag
 
 # groff-1.22.3 defects:
 # - .Pp before .Sh NAME causes a blank line before the header line
--- /dev/null
+++ regress/mdoc/Sh/tag.out_html
@@ -0,0 +1,9 @@
+<p class="Pp">Text in the subsection.</p>
+</section>
+</section>
+<section class="Sh">
+<h1 class="Sh" id="examples"><a class="permalink" href="#examples">EXAMPLES</a></h1>
+<p class="Pp">Text introducing examples.</p>
+<section class="Ss">
+<h2 class="Ss" id="example"><a class="permalink" href="#example">Subsection</a></h2>
+<p class="Pp">Example text.</p>
--- /dev/null
+++ regress/mdoc/Sh/tag.out_ascii
@@ -0,0 +1,22 @@
+SH-TAG(1)                   General Commands Manual                  SH-TAG(1)
+
+NNAAMMEE
+     SShh--ttaagg - tagging section headers
+
+DDEESSCCRRIIPPTTIIOONN
+     Text in the description.
+
+   SSuubbsseeccttiioonn
+     BEGINTEST
+
+     Text in the subsection.
+
+EEXXAAMMPPLLEESS
+     Text introducing examples.
+
+   SSuubbsseeccttiioonn
+     Example text.
+
+     ENDTEST
+
+OpenBSD                        February 27, 2020                       OpenBSD
--- /dev/null
+++ regress/mdoc/Sh/tag.out_markdown
@@ -0,0 +1,27 @@
+SH-TAG(1) - General Commands Manual
+
+# NAME
+
+**Sh-tag** - tagging section headers
+
+# DESCRIPTION
+
+Text in the description.
+
+## Subsection
+
+BEGINTEST
+
+Text in the subsection.
+
+# EXAMPLES
+
+Text introducing examples.
+
+## Subsection
+
+Example text.
+
+ENDTEST
+
+OpenBSD - February 27, 2020
--- /dev/null
+++ regress/mdoc/Sh/tag.in
@@ -0,0 +1,21 @@
+.\" $OpenBSD: tag.in,v 1.1 2020/02/27 21:38:27 schwarze Exp $
+.Dd $Mdocdate: February 27 2020 $
+.Dt SH-TAG 1
+.Os
+.Sh NAME
+.Nm Sh-tag
+.Nd tagging section headers
+.Sh DESCRIPTION
+Text in the description.
+.Ss Subsection
+BEGINTEST
+.Pp
+Text in the subsection.
+.Tg examples
+.Sh EXAMPLES
+Text introducing examples.
+.Tg example
+.Ss Subsection
+Example text.
+.Pp
+ENDTEST
