Package: htmldoc
Version: 1.9.22-1
Severity: important
Dear Maintainer,
Building the xfig package fails with htmldoc 1.9.22-1 because htmldoc segfaults:
make[2]: Entering directory '/build/xfig-3.2.9a/doc'
fig2dev -L png -S4 html/images/xfig-title.fig \
./html/images/xfig-title.png
cd ./html && htmldoc -f ../xfig_ref_en.pdf --no-title \
--webpage --header ..c -t pdf14 --size a4 contents.html \
$(/usr/bin/grep -F '<a href=' contents.html | /usr/bin/sed 's/.*a href="//; s/["\#].*//;' | uniq | /usr/bin/grep -F -v japanese)
make[2]: *** [Makefile:601: documentation] Segmentation fault (core dumped)
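For readers unfamiliar with the Makefile rule above: the grep/sed pipeline builds the list of HTML files that is passed to htmldoc. The following is only a rough Python sketch of what that pipeline appears to do. The function name `extract_targets` is made up for illustration, and the sketch has not been checked against every line of the real contents.html.

```python
import re


def extract_targets(lines):
    """Approximate: grep -F '<a href=' | sed ... | uniq | grep -F -v japanese."""
    targets = []
    for line in lines:
        line = line.rstrip("\n")
        if "<a href=" not in line:  # grep -F '<a href='
            continue
        line = re.sub(r'.*a href="', "", line)  # sed: s/.*a href="//
        line = re.sub(r'["\\#].*', "", line)  # sed: s/["\#].*//
        targets.append(line)
    # uniq only drops duplicates that are directly adjacent.
    deduped = [t for i, t in enumerate(targets) if i == 0 or t != targets[i - 1]]
    # grep -F -v japanese
    return [t for t in deduped if "japanese" not in t]
```

Because uniq only collapses adjacent duplicates, a file that is linked from several places in contents.html can still be passed to htmldoc more than once.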
This happens on amd64 as well as on i386, see
https://salsa.debian.org/debian/xfig/-/pipelines/1004335
(I can also reproduce this with my local sid cowbuilder on amd64).
Six days ago the pipeline still passed without any issue:
https://salsa.debian.org/debian/xfig/-/pipelines/1000175
but that run used htmldoc 1.9.21-1.
So I'm pretty sure that the issue is triggered by the upgrade of
htmldoc from 1.9.21-1 to 1.9.22-1.
Sadly, there are no debug symbols available in the package. It seems
that it would be a good idea to add
override_dh_auto_configure:
	dh_auto_configure -- --enable-debug
to debian/rules, to avoid stripping the debug symbols (before
Debian can save them to the dbgsym package).
So I built 1.9.22-1 myself without symbols stripped (on Debian trixie)
and then I was able to extract a backtrace from the segfault:
$ gdb htmldoc /tmp/core
GNU gdb (Debian 16.3-1) 16.3
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from htmldoc...
[New LWP 1995518]
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.debian.net>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/bin/htmldoc -f
/usr/local/src/Packages/xfig/xfig.git/doc/xfig_ref_en.pdf --no-title --webpage
--header ..c -t pdf14 --size a4 contents.html introduction.html main_menus.html
drawing.html editing.html attributes.html panning.html layers.html
global_settings.html miscellaneous.html fig-format.html i18n.html
latex_and_xfig.html miscellaneous.html accelerators.html options.html
printing.html latex_and_xfig.html printing.html new_features.html
bugs_fixed.html installation.html i18n.html fig-format.html faq.html
authors.html'.
Program terminated with signal SIGSEGV, Segmentation fault.
Download failed: Invalid argument. Continuing without source file
./htmldoc/./htmldoc/ps-pdf.cxx.
#0 0x0000561622a8261a in parse_doc (t=0x0, left=0x7ffda4badf94,
right=0x7ffda4badf98, bottom=0x7ffda4badf9c, top=0x7ffda4badfa0,
x=0x7ffda4badf8c, y=0x7ffda4badf90,
page=0x7ffda4badfa4, cpara=0x0, needspace=0x7ffda4badfa8) at
./htmldoc/ps-pdf.cxx:4037
warning: 4037 ./htmldoc/ps-pdf.cxx: No such file or directory
(gdb) bt
#0 0x0000561622a8261a in parse_doc (t=0x0, left=0x7ffda4badf94,
right=0x7ffda4badf98, bottom=0x7ffda4badf9c, top=0x7ffda4badfa0,
x=0x7ffda4badf8c, y=0x7ffda4badf90,
page=0x7ffda4badfa4, cpara=0x0, needspace=0x7ffda4badfa8) at
./htmldoc/ps-pdf.cxx:4037
#1 0x0000561622a83e2b in parse_doc (t=0x5616403ee450, left=0x7ffda4badf94,
right=<optimized out>, bottom=<optimized out>, top=<optimized out>,
x=<optimized out>,
y=<optimized out>, page=<optimized out>, cpara=<optimized out>,
needspace=<optimized out>) at ./htmldoc/ps-pdf.cxx:4448
#2 0x0000561622a828d6 in parse_doc (t=0x5616403e7610, left=0x7ffda4badf94,
right=<optimized out>, bottom=<optimized out>, top=<optimized out>,
x=<optimized out>,
y=<optimized out>, page=<optimized out>, cpara=<optimized out>,
needspace=<optimized out>) at ./htmldoc/ps-pdf.cxx:4409
#3 0x0000561622a8ba52 in pspdf_export (document=0x5616403081f0, toc=0x0) at
./htmldoc/ps-pdf.cxx:801
#4 0x0000561622a5421f in main (argc=<optimized out>, argv=<optimized out>) at
./htmldoc/htmldoc.cxx:1303
(gdb) bt full
#0 0x0000561622a8261a in parse_doc (t=0x0, left=0x7ffda4badf94,
right=0x7ffda4badf98, bottom=0x7ffda4badf9c, top=0x7ffda4badfa0,
x=0x7ffda4badf8c, y=0x7ffda4badf90,
page=0x7ffda4badfa4, cpara=0x0, needspace=0x7ffda4badfa8) at
./htmldoc/ps-pdf.cxx:4037
i = <optimized out>
doc = <optimized out>
para = <optimized out>
temp = <optimized out>
var = <optimized out>
name = <optimized out>
style = <optimized out>
width = <optimized out>
height = <optimized out>
rgb = {0, 0, 0}
descend = <optimized out>
levels = 2
#1 0x0000561622a83e2b in parse_doc (t=0x5616403ee450, left=0x7ffda4badf94,
right=<optimized out>, bottom=<optimized out>, top=<optimized out>,
x=<optimized out>,
y=<optimized out>, page=<optimized out>, cpara=<optimized out>,
needspace=<optimized out>) at ./htmldoc/ps-pdf.cxx:4448
i = <optimized out>
doc = <optimized out>
para = <optimized out>
temp = <optimized out>
var = <optimized out>
name = <optimized out>
style = <optimized out>
width = <optimized out>
height = <optimized out>
rgb = {0, 0, 0}
descend = false
levels = 2
#2 0x0000561622a828d6 in parse_doc (t=0x5616403e7610, left=0x7ffda4badf94,
right=<optimized out>, bottom=<optimized out>, top=<optimized out>,
x=<optimized out>,
y=<optimized out>, page=<optimized out>, cpara=<optimized out>,
needspace=<optimized out>) at ./htmldoc/ps-pdf.cxx:4409
i = <optimized out>
doc = <optimized out>
para = <optimized out>
temp = <optimized out>
var = <optimized out>
name = <optimized out>
style = <optimized out>
width = <optimized out>
height = <optimized out>
rgb = {0, 0, 0}
descend = false
levels = 2
#3 0x0000561622a8ba52 in pspdf_export (document=0x5616403081f0, toc=0x0) at
./htmldoc/ps-pdf.cxx:801
i = <optimized out>
title_file = <optimized out>
author = 0x0
creator = 0x0
copyright = 0x0
docnumber = <optimized out>
keywords = 0x0
subject = 0x0
lang = 0x0
t = <optimized out>
fp = <optimized out>
x = 0
y = 318.154114
left = 0
right = 487
bottom = 22
top = 748
width = <optimized out>
height = <optimized out>
page = 16
pos = <optimized out>
heading = <optimized out>
toc_duplex = 0
toc_landscape = 0
toc_width = <optimized out>
toc_length = <optimized out>
toc_left = <optimized out>
toc_right = <optimized out>
toc_bottom = <optimized out>
toc_top = <optimized out>
timage = <optimized out>
timage_width = <optimized out>
timage_height = <optimized out>
r = <optimized out>
rgb = {0, 4.61080276e-18, 3.08818156e-41}
needspace = 0
source_date_epoch = <optimized out>
adjust = <optimized out>
image_adjust = 22
temp_adjust = <optimized out>
#4 0x0000561622a5421f in main (argc=<optimized out>, argv=<optimized out>) at
./htmldoc/htmldoc.cxx:1303
i = <optimized out>
j = <optimized out>
document = 0x5616403081f0
file = <optimized out>
toc = 0x0
exportfunc = 0x561622a8b1c0 <pspdf_export(tree_t*, tree_t*)>
extension = <optimized out>
fontsize = <optimized out>
fontspacing = <optimized out>
num_files = 26
start_time = 1767972207.319273
load_time = 1767972207.356056
end_time = <optimized out>
debug = <optimized out>
I tried to identify the offending file and found that drawing.html
triggers the segfault, so the test case can be reduced to
htmldoc -f out.pdf -t pdf14 drawing.html
You can find the original drawing.html at
https://salsa.debian.org/debian/xfig/-/blob/debian/latest/doc/html/drawing.html
It seems that this file contains some faulty HTML, since htmldoc
processes it correctly after it has been run through (html)tidy.
Even so, it is not acceptable for htmldoc to segfault on faulty HTML.
I was able to reduce the file to the following contents, which still trigger the segfault:
---------------------- snip -------------------------------------
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title>Drawing Objects</title>
</head>
<body>
<h1>Drawing Objects</h1>
<dt>
<dd>
</body>
</html>
---------------------- snip -------------------------------------
It seems that the standalone "<dd>" (without content and without
</dd>) triggers the segfault in my case.
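In case it helps with reproducing this, here is a minimal Python sketch that writes the reduced file to a temporary directory and runs the same htmldoc command as above. The helper names `describe_exit` and `run_reproducer` are made up for illustration, and the script assumes htmldoc is in PATH. It is only a convenience wrapper, not a proper test harness.

```python
import shutil
import signal
import subprocess
import tempfile
from pathlib import Path

# Reduced test case, copied from this report.
MINIMAL_HTML = """\
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title>Drawing Objects</title>
</head>
<body>
<h1>Drawing Objects</h1>
<dt>
<dd>
</body>
</html>
"""


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def run_reproducer(htmldoc="htmldoc"):
    """Write the reduced HTML to a temporary directory and run htmldoc on it."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "drawing.html"
        source.write_text(MINIMAL_HTML)
        output = Path(tmp) / "out.pdf"
        command = [htmldoc, "-f", str(output), "-t", "pdf14", str(source)]
        return subprocess.run(command, capture_output=True).returncode


if __name__ == "__main__":
    if shutil.which("htmldoc") is None:
        print("htmldoc not found in PATH, nothing to run")
    else:
        print("htmldoc", describe_exit(run_reproducer()))
```

With an affected htmldoc I would expect this to report that the process was killed by SIGSEGV. With a fixed or older version it should report a normal exit instead.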
It seems that for xfig I can work around this issue by simply removing
the "<dd>" from line 162 of drawing.html.
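To look for other tags of this kind in the xfig documentation, something like the following Python sketch could be used. It reports the line numbers of "<dd>" tags that are followed by no text and no nested element before the next "<dt>", "<dd>", or closing "</dd>", "</dl>", "</body>" or "</html>". The class and function names are hypothetical, and the definition of "empty" is my own assumption about what matters here. It is not derived from the htmldoc sources.

```python
from html.parser import HTMLParser


class EmptyDdFinder(HTMLParser):
    """Collect line numbers of <dd> start tags that have no content."""

    def __init__(self):
        super().__init__()
        self.open_dd_line = None  # line of a <dd> that has had no content so far
        self.empty_dd_lines = []

    def flush_pending(self):
        # A <dd> that is still pending when its scope ends was empty.
        if self.open_dd_line is not None:
            self.empty_dd_lines.append(self.open_dd_line)
            self.open_dd_line = None

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self.flush_pending()
            if tag == "dd":
                self.open_dd_line = self.getpos()[0]
        else:
            # Any nested element counts as content of the pending <dd>.
            self.open_dd_line = None

    def handle_data(self, data):
        if data.strip():
            self.open_dd_line = None

    def handle_endtag(self, tag):
        if tag in ("dd", "dl", "body", "html"):
            self.flush_pending()


def find_empty_dd(html_text):
    """Return the 1-based line numbers of content-free <dd> tags."""
    finder = EmptyDdFinder()
    finder.feed(html_text)
    finder.close()
    finder.flush_pending()  # covers a <dd> left open at end of input
    return finder.empty_dd_lines
```

Calling `find_empty_dd(open("drawing.html").read())` returns a list of candidate line numbers. Whether every tag reported this way actually triggers the crash is not something I have verified.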
But this should be fixed in htmldoc, since segfaults are usually
indicators of security issues.
I hope the above information is sufficient to fix the issue or to
forward it upstream.
Greetings
Roland