Hi,

On 2026-04-28 at 22:28-05:00, G. Branden Robinson wrote:
> This message of yours:
>
> https://lists.gnu.org/archive/html/groff/2026-04/msg00011.html
>
> ...was sufficient to make me notice a problem. Thank you for the simple
> reproducer! You've dug more deeply into the nature of the issue than
> I have; I've hacked on GNU eqn a bit but less than the formatter or tbl.
>
> But I agree that this:
>
> $ printf '.EQ\napprox\n.EN\n' | ./eqn -TMathML
> .do if !dEQ .ds EQ
> .do if !dEN .ds EN
> .EQ
> <math><mtext>\(~=</mtext></math>
> .EN
>
> ...is a smoking gun of wrongness, and I can reproduce it. I'm not
> surprised to observe it in every groff release from 1.22.3 forward, and
> I'd guess it's been a problem with GNU eqn's MathML mode "forever".

This is a different, unrelated issue: in src/preproc/eqn/lex.cpp,
approx is converted to (backtick-quoted for less noise)
`type "relation" "\(~="`, and get_token then treats "\(~="
as a QUOTED_TEXT. The double-quoting is there to prevent the tilde
from being recognized as a space. Solving this would require changing
the parsing behavior to allow look-ahead for the tilde. I'm looking
into it. Besides approx, some other reproducers are
\[~=], \[~~] and \[=~].

On 2026-04-28 at 22:28-05:00, G. Branden Robinson wrote:
> I wonder if this issue has (nearly[1]) the same cause as an existing
> Savannah ticket.
>
> https://savannah.gnu.org/bugs/?66592
>
> Could you review that and confirm or refute?

IIUC the floor bug has a different cause, and the patch
I'm pinging here is a separate matter as well:
https://lists.gnu.org/r/groff/2026-03/msg00113.html

On 2026-04-28 at 22:28-05:00, G. Branden Robinson wrote:
> At 2026-04-29T11:56:58+0900, Nguyễn Gia Phong wrote:
> > On 2026-03-26 at 13:48+09:00, Nguyễn Gia Phong wrote:
> > > XML only defines four entities (< > & ") out of the box,
> > > others need to be declared in the document's DOCTYPE.
> > > For web feeds such as RSS and Atom, it is particularly cumbersome
> > > to define the math entities as these feeds are supposed
> > > to be stand-alone and thus the entity definitions have to be inlined.
> > >
> > > Therefore, character references are now used
> > > instead of entity references, making the MathML output
> > > directly embeddable into these feeds. The entity table
> > > is no longer used and thus removed.
> > >
> > > * src/preproc/eqn/text.cpp: Remove struct map, entity_table,
> > >   and special_to_entity. Include "unicode.h" header file.
> > >   (special_char_box::output): Instead of named entity reference,
> > >   print XML character reference with Unicode codepoint for MathML.
> > >   Add support for Unicode code sequence as an input character.
> > >
> > > References: https://www.w3.org/TR/REC-xml/#sec-references
>
> if I understand your analysis
> correctly, you're probably on the right track: the special character
> rewriting table that GNU eqn uses for MathML mode is flat wrong, [so] copying
> or reusing the one for troff output.

I'm not claiming that it is flat wrong or buggy, but rather that
it is of limited usefulness, as the generated MathML *cannot*
be embedded in web feeds. I assume that very few people make use
of the MathML output, hence many problems remain undetected.

Cheers,
Phong
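
P.S. To illustrate the character references mentioned in the quoted
commit message, here is a minimal sketch. It is not the code from
the patch: the helper name xml_char_ref is hypothetical, and U+2248
(ALMOST EQUAL TO) is only an example code point. The point is that
a numeric character reference needs no declaration in a DOCTYPE,
whereas a named math entity does.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (an assumption for illustration, not taken from
// groff): format a Unicode code point as an XML hexadecimal character
// reference, e.g. 0x2248 becomes "&#x2248;".
std::string xml_char_ref(unsigned long code_point)
{
  // "&#x" + up to 16 hex digits + ";" + terminating NUL fits in 24 bytes;
  // snprintf never writes past the buffer in any case.
  char buf[24];
  std::snprintf(buf, sizeof buf, "&#x%04lX;", code_point);
  return std::string(buf);
}
```

Calling xml_char_ref(0x2248) yields the string &#x2248;, which can be
placed directly inside an <mo> or <mtext> element of a stand-alone feed.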
