Bug#901636: mandoc: "mandoc -mdoc -T man" causes memory dump

2021-07-04 Thread Ingo Schwarze
I just fixed the assertion failure upstream at mandoc.bsd.lv,
see https://inbox.vuxu.org/mandoc-source/c2aa13aac88a7...@mandoc.bsd.lv/
and http://cvsweb.bsd.lv/mandoc/ .

Supporting tbl(7) and eqn(7) in -T man has been on the
http://cvsweb.bsd.lv/mandoc/TODO list for some time, but it requires
a non-trivial amount of work and is not particularly high priority
for the following reason: The main use case for -T man is that you
can maintain the documentation of your portable software project
in the mdoc(7) format and yet provide autogenerated man(7) versions
of your manual pages for the very few remaining operating systems
that still do not support mdoc(7).  If you care about that, embedding
tbl(7) or eqn(7) code in your manual pages is just a bad idea in
the first place.

Thanks to Nab for proposing patches, but these can't be committed
as-is because they constitute a layering violation.  The new code
is essentially a tbl-to-tbl output mode and would belong in a new
module tbl_tbl.c; it is misplaced in mdoc_man.c because it is neither
related to mdoc(7) input nor to man(7) output.  It appears setting
that up properly wouldn't be excessively difficult, but I don't
have the time to do so right now.  Besides, this would be the first
src_dst.c output module where src == dst, so some generic design
questions might need to be considered before commit.

Thanks to all of you for your input!
  Ingo



Bug#901636: mandoc: "mandoc -mdoc -T man" causes memory dump

2021-07-03 Thread наб
Hi, sorry, me again, this also affects .TS requests in mdoc mode,
and minimises to:
-- >8 --
.TS
l .
a
.TE
-- >8 --
which dies as such:
-- >8 --
$ { echo '.TS'; echo 'l .'; echo 'a'; echo '.TE'; } | mandoc -Tman -mdoc
.\" Automatically generated from an mdoc input file.  Do not edit.
.TH "UNTITLED" "" "July 3, 2021" "" "LOCAL"
.nh
mandoc: mdoc_man.c:300: mdoc_man_act: Assertion `tok >= MDOC_Dd && tok <= MDOC_MAX' failed.
Aborted
-- >8 --
(original and minimal pages attached)

Attaching GDB yields, rather similarly,
  tok = TOKEN_NONE, type = ROFFT_TBL,

Would it make sense to pass tbl(7) and eqn(7) through,
like it's done in man input mode?
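
To make the question concrete, here is a rough sketch of what "passing through" could mean at the dispatch level: look at the node type first, and only treat the token as an mdoc macro for ordinary nodes. Everything prefixed sk_ is made up for illustration; only the names ROFFT_TBL (from the GDB output above) and ROFFT_EQN (from the earlier report) mirror mandoc, and this is not a claim about how mdoc_man.c is or should be structured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, abbreviated node types; only the TBL and EQN entries
 * correspond to names that appear in this thread. */
enum sk_type {
	SK_ROFFT_ELEM,
	SK_ROFFT_TEXT,
	SK_ROFFT_TBL,
	SK_ROFFT_EQN
};

/* Hypothetical, minimal stand-in for struct roff_node. */
struct sk_node {
	enum sk_type	 type;
	int		 tok;	/* not a macro token for TBL/EQN nodes */
};

enum sk_route {
	SK_ROUTE_MACRO,		/* look the token up in the macro table */
	SK_ROUTE_TEXT,		/* print as plain text */
	SK_ROUTE_PASSTHROUGH	/* hand off to a separate tbl/eqn handler */
};

/* Decide how to handle a node without ever inspecting n->tok for
 * node types that do not carry a macro token. */
static enum sk_route
sk_route_node(const struct sk_node *n)
{
	assert(n != NULL);
	switch (n->type) {
	case SK_ROFFT_TBL:
	case SK_ROFFT_EQN:
		return SK_ROUTE_PASSTHROUGH;
	case SK_ROFFT_TEXT:
		return SK_ROUTE_TEXT;
	default:
		return SK_ROUTE_MACRO;
	}
}
```

With a dispatch of this shape, a node of type ROFFT_TBL would never reach the range check on tok, whatever the pass-through handler ends up doing.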

Best,
наб
.TS
l .
a
.TE
.\" SPDX-License-Identifier: 0BSD
.\"
.Dd
.Dt SHA1SUM 1
.Os
.
.Sh NAME
.Nm sha*sum , b2sum , md5sum
.Nd generate or verify cryptographic hashes
.Sh SYNOPSIS
.Nm sha1sum
.Op Fl ztb
.Op Fl -tag
.Oo Ar file Oc Ns …
.Nm sha224sum
.Op Fl ztb
.Op Fl -tag
.Oo Ar file Oc Ns …
.Nm sha256sum
.Op Fl ztb
.Op Fl -tag
.Oo Ar file Oc Ns …
.Nm sha384sum
.Op Fl ztb
.Op Fl -tag
.Oo Ar file Oc Ns …
.Nm sha512sum
.Op Fl ztb
.Op Fl -tag
.Oo Ar file Oc Ns …
.Nm b2sum
.Op Fl ztb
.Op Fl l Ar BITS
.Op Fl -tag
.Oo Ar file Oc Ns …
.Nm md5sum
.Op Fl ztb
.Op Fl -tag
.Oo Ar file Oc Ns …
.Pp
.Nm sha1sum
.Fl c
.Op Fl w
.Op Fl -strict
.Op Fl -quiet
.Op Fl -status
.Op Fl -ignore-missing
.Oo Ar file Oc Ns …
.Nm sha224sum
.Fl c
.Op Fl w
.Op Fl -strict
.Op Fl -quiet
.Op Fl -status
.Op Fl -ignore-missing
.Oo Ar file Oc Ns …
.Nm sha256sum
.Fl c
.Op Fl w
.Op Fl -strict
.Op Fl -quiet
.Op Fl -status
.Op Fl -ignore-missing
.Oo Ar file Oc Ns …
.Nm sha384sum
.Fl c
.Op Fl w
.Op Fl -strict
.Op Fl -quiet
.Op Fl -status
.Op Fl -ignore-missing
.Oo Ar file Oc Ns …
.Nm sha512sum
.Fl c
.Op Fl w
.Op Fl -strict
.Op Fl -quiet
.Op Fl -status
.Op Fl -ignore-missing
.Oo Ar file Oc Ns …
.Nm b2sum
.Fl c
.Op Fl l Ar BITS
.Op Fl w
.Op Fl -strict
.Op Fl -quiet
.Op Fl -status
.Op Fl -ignore-missing
.Oo Ar file Oc Ns …
.Nm md5sum
.Fl c
.Op Fl w
.Op Fl -strict
.Op Fl -quiet
.Op Fl -status
.Op Fl -ignore-missing
.Oo Ar file Oc Ns …
.
.Sh DESCRIPTION
Without
.Fl c ,
hash
.Ar file Ns s Pq the standard input stream if Qo Sy - Qc , the default ,
writing the hashes to the standard output stream.
With
.Fl c ,
verify a hash set from
.Ar file Ns s Pq likewise ,
against on-disk state.
.Pp
.TS
lb lb lb lb
l l l l l .
Program	Digest	Standard	Length
_
\fBsha1sum\fP	SHA-1	FIPS PUB 180	160 bits	avoid use in security applications
\fBsha224sum\fP, \fBsha256sum\fP,
\fBsha384sum\fP, \fBsha512sum\fP	SHA-2	FIPS PUB 180-2	respective
\fBb2sum\fP	BLAKE2b	RFC 7693	8-512 bits
\fBmd5sum\fP	MD5	RFC 1321	128 bits	\fINOT\fP suitable for any application
.TE
.
.Ss Generation
When run without
.Fl c
nor
.Fl -tag ,
a listing is produced in the following form:
.Bd -literal -compact -offset Ds
.Li $ Nm sha1sum Li README.md
.Li 22eb73bd30d47540a4e05781f0f6e07640857cae Sy "\ " Ns Ar LICENCE
.Ed
With
.Fl b ,
an asterisk replaces the second space:
.Bd -literal -compact -offset Ds
.Li $ Nm echo Qo POSIX.2 Qc Sy \&| Nm sha1sum Fl b
.Li 33210013f52127d6ada425601f16bbb62e85f3be Sy "*" Ns Ar -
.Ed
.Pp
With
.Fl -tag ,
the output is in the form
.Bd -literal -compact -offset Ds
.Li $ Nm echo Qo POSIX.2 Qc Sy \&| Nm sha1sum Fl -tag Li LICENCE -
.Sy SHA1 Ns Li \&( Ns Ar LICENCE Ns Li )\& Sy = Li bd25664d6e803060dcb31bfdd9642ba9d8a3f1b9
.Sy SHA1 Ns Li \&( Ns Ar - Ns Li )\& Sy = Li 33210013f52127d6ada425601f16bbb62e85f3be
.Ed
A dash and width in bits is appended for non-default digest lengths:
.Bd -literal -compact -offset Ds
.Li $ Nm echo Qo POSIX.2 Qc Sy \&| Nm b2sum Fl -tag l Ar 96
.Sy BLAKE2b- Ns Ar 96 Ns Li \&( Ns Ar - Ns Li )\& Sy = Li 386b90bea2a1e566249cdb96
.Ed
.Pp
If the filename contains a backslash or newline characters, they're replaced with
.Qq Sy \e\e
and
.Qq Sy \en ,
respectively, and a backslash is prepended to the line:
.Bd -literal -compact -offset Ds
.Li $ Nm echo Li 'trademark of AT&T' Sy > Li \&"$(printf 'UNIX\enregistered')"
.Li $ Nm sha1sum Li \&"$(printf 'UNIX\enregistered')"
.Sy \e Ns Li 7390a4a0bfb7c6da55d6f5f3db4e42c534271363 Sy "\ " Ns Ar UNIX\enregistered
.Li $ Nm sha1sum Fl -tag Li \&"$(printf 'UNIX\enregistered')"
.Sy \eSHA1 Ns Li \&( Ns Ar UNIX\enregistered Ns Li )\& Sy = Li 7390a4a0bfb7c6da55d6f5f3db4e42c534271363
.Ed
With
.Fl z ,
lines are separated by NUL bytes and escaping does not occur.
.
.Ss Verification
With
.Fl c ,
the
.Ar file Ns s
are instead the output in the default format:
lines consisting of a nonzero, even amount of hexadecimal digits (any case), a space, a space or an asterisk and a filename
(if the first character of a line is a backslash, it's skipped and the filename is de-escaped).
With
.Fl w ,
an error is written to the standard error stream for each invalid line.
With
.Fl -strict ,
invalid lines yield a non-zero exit code.
Be wary of using
.Qq Sy -
.Ar file Ns s
and
.Qq Sy -
hashed files.
.Pp
For each valid line, the file is hashed, c

Bug#901636: mandoc: "mandoc -mdoc -T man" causes memory dump

2021-06-07 Thread наб
found 901636 1.14.5-1
thanks

Hi!

I think I hit this as well, but I've a minimal example to show for it:
-- >8 --
.Dd June  7, 2021
.Dt SLEEP 1
.Os
.EQ
Z = X + Y
.EN
-- >8 --

Running "mandoc -Tman" on this file produces some output until it
explodes on the .EQ directive (I think? an equivalent thing happens on
the unminimised version (attached)):
-- >8 --
.\" Automatically generated from an mdoc input file.  Do not edit.
.TH "SLEEP" "1" "June 7, 2021" "Debian" "General Commands Manual"
.nh
mandoc: mdoc_man.c:686: print_node: Assertion `n->tok >= MDOC_Dd && n->tok < MDOC_MAX' failed.
Aborted
-- >8 --

Happens on buster (1.14.4-1) and sid (1.14.5-1).

GDB has this to say about it:
-- >8 --
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0xf7e0c515 in __GI_abort () at abort.c:79
#2  0xf7e0c3fe in __assert_fail_base (fmt=, assertion=, file=, line=300,
    function=) at assert.c:92
#3  0xf7e1b1ff in __GI___assert_fail (assertion=assertion@entry=0x565a4c88 "tok >= MDOC_Dd && tok <= MDOC_MAX",
    file=file@entry=0x565a4bba "mdoc_man.c", line=line@entry=300,
    function=function@entry=0x565a4b88 <__PRETTY_FUNCTION__.2> "mdoc_man_act") at assert.c:101
#4  0x5657f29f in mdoc_man_act (tok=) at mdoc_man.c:300
#5  0x5657f4e8 in mdoc_man_act (tok=) at mdoc_man.c:689
#6  print_node (meta=meta@entry=0x565c6600, n=n@entry=0x565cc610) at mdoc_man.c:688
#7  0x56580a4a in man_mdoc (arg=, mdoc=mdoc@entry=0x565c6600) at mdoc_man.c:634
#8  0x5657d8cc in parse (curp=curp@entry=0xd450, fd=fd@entry=4, file=file@entry=0xd786 "../sleep.1-min")
    at main.c:859
#9  0x565613bf in main (argc=, argv=) at main.c:540

(gdb) up
#6  print_node (meta=meta@entry=0x565c6600, n=n@entry=0x565cc610) at mdoc_man.c:688
(gdb) p n
$2 = (struct roff_node *) 0x565cc610
(gdb) p *n
$3 = {parent = 0x565c6670, child = 0x0, last = 0x0, next = 0x0, prev = 0x565cc590, head = 0x0, body = 0x0, tail = 0x0,
  args = 0x0, norm = 0x0, string = 0x0, span = 0x0, eqn = 0x565cc670, line = 4, pos = 1, flags = 9, prev_font = 0,
  aux = 0, tok = TOKEN_NONE, type = ROFFT_EQN, sec = SEC_NONE, end = ENDBODY_NOT}
-- >8 --

n->tok is what's being passed to mdoc_man_act(), which trips the
assertion, and TOKEN_NONE is just below MDOC_Dd, according to roff.h:
-- >8 --
	ROFF_cblock,
	ROFF_RENAMED,
	ROFF_USERDEF,
	TOKEN_NONE,
	MDOC_Dd,
	MDOC_Dt,
	MDOC_Os,
-- >8 --
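
To spell out why that ordering matters, here is a small stand-alone sketch of the range check. The enum is a heavily abbreviated stand-in for the one in roff.h: only the position of TOKEN_NONE directly before MDOC_Dd is taken from the excerpt above, and the sk_ names, the three-entry table and the lookup function are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Abbreviated stand-in for the token enum; the real list is much
 * longer.  TOKEN_NONE sits immediately before the first mdoc token. */
enum sk_tok {
	SK_ROFF_USERDEF,
	SK_TOKEN_NONE,
	SK_MDOC_Dd,
	SK_MDOC_Dt,
	SK_MDOC_Os,
	SK_MDOC_MAX
};

/* Same shape as the condition in the failing assertion. */
static bool
sk_tok_is_mdoc(enum sk_tok tok)
{
	return tok >= SK_MDOC_Dd && tok < SK_MDOC_MAX;
}

/* Hypothetical table lookup guarded by the range check; calling it
 * with SK_TOKEN_NONE aborts, just like the reported crash. */
static const char *
sk_lookup(enum sk_tok tok)
{
	static const char *const names[] = { "Dd", "Dt", "Os" };

	assert(sk_tok_is_mdoc(tok));
	return names[tok - SK_MDOC_Dd];
}
```

Since TOKEN_NONE is one less than MDOC_Dd, the first half of the condition is false for it, so any node carrying TOKEN_NONE fails the check before the table is indexed.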

The entire document(?) seems to be
-- >8 --
(gdb) p mdoc->first
$14 = (struct roff_node *) 0x565c6670
(gdb) p *mdoc->first
$15 = {parent = 0x0, child = 0x565cc260, last = 0x565cc610, next = 0x0, prev = 0x0, head = 0x0, body = 0x0, tail = 0x0,
  args = 0x0, norm = 0x0, string = 0x0, span = 0x0, eqn = 0x0, line = 0, pos = 0, flags = 3, prev_font = 0, aux = 0,
  tok = TOKEN_NONE, type = ROFFT_ROOT, sec = SEC_NONE, end = ENDBODY_NOT}
(gdb) p *mdoc->first->child
$16 = {parent = 0x565c6670, child = 0x565cc2d0, last = 0x565cc3b0, next = 0x565cc430, prev = 0x0, head = 0x0,
  body = 0x0, tail = 0x0, args = 0x0, norm = 0x0, string = 0x0, span = 0x0, eqn = 0x0, line = 1, pos = 1, flags = 1035,
  prev_font = 0, aux = 0, tok = MDOC_Dd, type = ROFFT_ELEM, sec = SEC_NONE, end = ENDBODY_NOT}
(gdb) p *mdoc->first->child->next
$17 = {parent = 0x565c6670, child = 0x565cc4a0, last = 0x565cc510, next = 0x565cc590, prev = 0x565cc260, head = 0x0,
  body = 0x0, tail = 0x0, args = 0x0, norm = 0x0, string = 0x0, span = 0x0, eqn = 0x0, line = 2, pos = 1, flags = 1035,
  prev_font = 0, aux = 0, tok = MDOC_Dt, type = ROFFT_ELEM, sec = SEC_NONE, end = ENDBODY_NOT}
(gdb) p *mdoc->first->child->next->next
$18 = {parent = 0x565c6670, child = 0x0, last = 0x0, next = 0x565cc610, prev = 0x565cc430, head = 0x0, body = 0x0,
  tail = 0x0, args = 0x0, norm = 0x0, string = 0x0, span = 0x0, eqn = 0x0, line = 3, pos = 1, flags = 1035,
  prev_font = 0, aux = 0, tok = MDOC_Os, type = ROFFT_ELEM, sec = SEC_NONE, end = ENDBODY_NOT}
(gdb) p *mdoc->first->child->next->next->next
$19 = {parent = 0x565c6670, child = 0x0, last = 0x0, next = 0x0, prev = 0x565cc590, head = 0x0, body = 0x0, tail = 0x0,
  args = 0x0, norm = 0x0, string = 0x0, span = 0x0, eqn = 0x565cc670, line = 4, pos = 1, flags = 9, prev_font = 0,
  aux = 0, tok = TOKEN_NONE, type = ROFFT_EQN, sec = SEC_NONE, end = ENDBODY_NOT}
(gdb) p *mdoc->first->child->next->next->next->next
Cannot access memory at address 0x0
-- >8 --

Which is likely to be produced by roff.c#roff_EQ(), which seems to be
one of the very few spots where a TOKEN_NONE is actively produced,
and the only one where ROFFT_EQN is:
-- >8 --
static int
roff_EQ(ROFF_ARGS)
{
	struct roff_node	*n;

	if (r->man->meta.macroset == MACROSET_MAN)
		man_breakscope(r->man, ROFF_EQ);
	n = roff_node_alloc(r->man, ln, ppos, ROFFT_EQN, TOKEN_NONE);
	if (ln > r->man->last->line)
		n->flags |= NODE_LINE;

Bug#901636: mandoc: "mandoc -mdoc -T man" causes memory dump

2019-03-01 Thread Bdale Garbee
Bjarni Ingi Gislason  writes:

> On Fri, Jul 27, 2018 at 03:22:03AM -0600, Bdale Garbee wrote:
>> Bjarni Ingi Gislason  writes:
>> 
>> >   Output from "mandoc -mdoc -T man groff_mdoc.7":
>> 
>> Where is "groff_mdoc.7" from?  I have no such file on my system, so
>> there's no way for me to try and reproduce this error.
>> 
>
>   It is in the "groff" package.

Thanks.  I can confirm that I see this error on this source file, too,
with mdocml version 1.14.4-1.

I do not plan to pursue this further, though, since I don't actually use
mdocml any more.  Michael Stapelberg has been taking care of the package
since Feb 2017... I just realized we never actually updated the
Maintainer assertion in the package metadata to reflect that, though.
I've just made that update in the packaging repository for the next upload.

Regards,

Bdale




Bug#901636: mandoc: "mandoc -mdoc -T man" causes memory dump

2019-03-01 Thread Bjarni Ingi Gislason
On Fri, Jul 27, 2018 at 03:22:03AM -0600, Bdale Garbee wrote:
> Bjarni Ingi Gislason  writes:
> 
> >   Output from "mandoc -mdoc -T man groff_mdoc.7":
> 
> Where is "groff_mdoc.7" from?  I have no such file on my system, so
> there's no way for me to try and reproduce this error.
> 

  It is in the "groff" package.

-- 
Bjarni I. Gislason



Bug#901636: mandoc: "mandoc -mdoc -T man" causes memory dump

2018-07-27 Thread Bdale Garbee
Bjarni Ingi Gislason  writes:

>   Output from "mandoc -mdoc -T man groff_mdoc.7":

Where is "groff_mdoc.7" from?  I have no such file on my system, so
there's no way for me to try and reproduce this error.

Bdale




Bug#901636: mandoc: "mandoc -mdoc -T man" causes memory dump

2018-06-15 Thread Bjarni Ingi Gislason
Package: mandoc
Version: 1.14.3-4
Severity: important

  Output from "mandoc -mdoc -T man groff_mdoc.7":

mandoc: mdoc_man.c:678: print_node: Assertion `n->tok >= MDOC_Dd && n->tok < MDOC_MAX' failed.
Aborted (core dumped)

-- System Information:
Debian Release: buster/sid
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'proposed-updates'), (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.9.88-1-u1 (SMP w/2 CPU cores)
Locale: LANG=is_IS.iso88591, LC_CTYPE=is_IS.iso88591 (charmap=ISO-8859-1), LANGUAGE=is_IS.iso88591 (charmap=ISO-8859-1)
Shell: /bin/sh linked to /bin/dash
Init: sysvinit (via /sbin/init)

Versions of packages mandoc depends on:
ii  libc6   2.27-3
ii  zlib1g  1:1.2.11.dfsg-1

mandoc recommends no packages.

mandoc suggests no packages.

-- no debconf information

-- 
Bjarni I. Gislason