URL: <https://savannah.gnu.org/bugs/?67133>
Summary: [groff] pipeline handling conceals fate of commands that die by SIGPIPE Group: GNU roff Submitter: gbranden Submitted: Sat 17 May 2025 02:00:53 AM GMT Category: Core Severity: 4 - Important Item Group: Incorrect behaviour Status: In Progress Privacy: Public Assigned to: gbranden Open/Closed: Open Discussion Lock: Any Planned Release: None _______________________________________________________ Follow-up Comments: ------------------------------------------------------- Date: Sat 17 May 2025 02:00:53 AM GMT By: G. Branden Robinson <gbranden> This has been a super-annoying problem for a long time, and I believe it has played a role in frustrating _grohtml_ development. Now that I've got a series of commits ready it's easier to just quote them. commit 21d5356b77b48a3e4ce2231698fec856603c2489 Author: G. Branden Robinson <g.branden.robin...@gmail.com> Date: Fri May 16 19:24:38 2025 -0500 [troff]: Add assertions (INSTALL FAILS). * src/roff/troff/node.cpp (real_output_file::on, real_output_file::off): Add `assert()`ions of complementary Boolean state before unconditionally assigning the other. One of them fails, in a quiet and underhanded way, when using grohtml to format "doc/pic.ms", which scandalously doesn't provoke a build failure. No automated tests trip it, either. A document in the wild could conceivably trip either. If one does, we want to hear about it, preferably with a core file generated by an unstripped troff executable. I have a fairly hard rule about not pushing commits that break the build. Technically this doesn't. groff _builds_ just fine. But if you try to install it, it will fail: .../share/doc/groff-1.23.0/html/img /usr/bin/install: cannot stat '.../GIT/groff/build/doc/img/pic*': No such file or directory ...and sure enough, the pic-1.png and other image files to be inlined into the generated HTML files aren't there. Why aren't they there? Why didn't the build fail earlier? Why didn't groff throw a fit when grohtml couldn't generate the desired output? ("Why aren't you over there stomping Private Pyle's guts out?!") Two reasons. 1. src/roff/groff/pipeline.c appears to have a 33-year-old bug in it. 2. Our doc/doc.am script doesn't try hard enough to detect failure. Over the past 8 years, I've on rare occasions seen mysterious failures from grohtml, usually when formatting pic.ms, but they've always neglected to leave behind evidence of the crime (like a core dump), and worse, like in this case, usually weren't detectable if all you did was build everything, as opposed to trying to install it. Now at last I think I have my fingers around the throat of the problem. In the next commits I'll address the two issues above. ChangeLog | 12 ++++++++++++ src/roff/troff/node.cpp | 6 ++++-- 2 files changed, 16 insertions(+), 2 deletions(-) commit 74b8bf5a1a9d60c38cfa4160f328ac20497cf731 Author: G. Branden Robinson <g.branden.robin...@gmail.com> Date: Fri May 16 08:04:22 2025 -0500 Detect grohtml failure better (BUILD FAILS). * doc/doc.am (doc/pic.html, doc/webpage.html): Create output as a ".tmp"-suffixed file at first, then check for the first of multiple associated image files we know should exist, moving the target into place only if it does. Per the previous commit, now the build truly fails when "groff -Thtml" does. But only because we're looking for files that should exist, but don't. Why is groff exiting with status 0? For that, we need another change. ChangeLog | 9 +++++++++ doc/doc.am | 10 +++++++--- 2 files changed, 16 insertions(+), 3 deletions(-) commit 9a3dd212cfbaa6ab5238c4de10848e32991c6c13 Author: G. Branden Robinson <g.branden.robin...@gmail.com> Date: Fri May 16 19:49:33 2025 -0500 [groff]: Fail if piped cmd signaled (BUILD FAILS). * src/roff/groff/pipeline.c (run_pipeline) [!_WIN32 && !__MSDOS__ && !_UWIN && !__CYGWIN__ && !__EMX__]: _Unconditionally_ set bit 2 of the return value in the event of _any_ signal hitting a pipelined process. A workaround that James Clark put in for a SunOS 4.1.1 X11-related bug in groff 1.06 (1992) appears to have led us to grief; if any of the child processes in the pipeline being wait(2)ed on was signaled with SIGPIPE, this alteration of return the value was not being done. If that child was the last in the pipeline (gxditview, the code presumes), he walks the list of commands in the pipeline and kill(2)s them all with SIGPIPE. If a process has multiple fatal signals pending, which one wins? Apparently, on Linux 5.10, in a fight between SIGPIPE and SIGABRT (raised by abort(3), called by assert(3)), SIGPIPE always wins. So bit 2 of this function's return value, which (after a left shift) ultimately becomes groff(1)'s exit status, never got set, and so groff happily reported success when it should have screamed hideously of failure. Likely the SunOS 4 workaround should be ripped out entirely, but this fix adequately detects grohtml failures. I experimentally instrumented "pipeline.c" a bit, which helped me track down the issue. The following was as good as a smoking gun to me. GBR: Unix run_pipeline() GBR: PID 54503 caught signal 13 GBR: run_pipeline() returning 0 ChangeLog | 24 ++++++++++++++++++++++++ src/roff/groff/pipeline.c | 8 +++----- 2 files changed, 27 insertions(+), 5 deletions(-) commit 718cb68badf5fecadec1bcea7328c02767651a0b Author: G. Branden Robinson <g.branden.robin...@gmail.com> Date: Fri May 16 20:44:59 2025 -0500 [troff]: Comment out assertion. * src/roff/troff/node.cpp (real_output_file::on): Comment out assertion that fails when formatting "pic.ms" as HTML. ChangeLog | 5 +++++ src/roff/troff/node.cpp | 2 +- 2 files changed, 6 insertions(+), 1 deletion(-) _______________________________________________________ Reply to this item at: <https://savannah.gnu.org/bugs/?67133> _______________________________________________ Message sent via Savannah https://savannah.gnu.org/
signature.asc
Description: PGP signature