Hi,

On 2026-04-28 at 22:28-05:00, G. Branden Robinson wrote:
> This message of yours:
>
> https://lists.gnu.org/archive/html/groff/2026-04/msg00011.html
>
> ...was sufficient to make me notice a problem.  Thank you for the simple
> reproducer!  You've dug more deeply into the nature of the issue than
> I have; I've hacked on GNU eqn a bit but less than the formatter or tbl.
>
> But I agree that this:
>
> $ printf '.EQ\napprox\n.EN\n' | ./eqn -TMathML
> .do if !dEQ .ds EQ
> .do if !dEN .ds EN
> .EQ
> <math><mtext>\(~=</mtext></math>
> .EN
>
> ...is a smoking gun of wrongness, and I can reproduce it.  I'm not
> surprised to observe it in every groff release from 1.22.3 forward, and
> I'd guess it's been a problem with GNU eqn's MathML mode "forever".

This is a different, unrelated issue.  In src/preproc/eqn/lex.cpp,
approx is defined as (backtick-quoted for less noise)
`type "relation" "\(~="`, and get_token treats "\(~="
as QUOTED_TEXT.  The double quotes are there to prevent the tilde
from being recognized as a space.
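
To illustrate, here is a toy sketch in Python.  It is not the code
in lex.cpp: `tokenize` and the TEXT/SPACE token names are made up
for this example, and only QUOTED_TEXT mirrors the name above.
An unquoted tilde is lexed as a space, whereas a double-quoted
string comes back whole, escape sequence and all:

```python
def tokenize(src):
    """Toy lexer with two of eqn's rules; NOT GNU eqn's get_token."""
    tokens = []
    i = 0
    while i < len(src):
        c = src[i]
        if c == '"':
            # Everything up to the closing quote is one opaque token.
            # (An unterminated quote raises ValueError in this toy.)
            end = src.index('"', i + 1)
            tokens.append(("QUOTED_TEXT", src[i + 1:end]))
            i = end + 1
        elif c == "~":
            # An unquoted tilde is a space request, not part of a name.
            tokens.append(("SPACE", c))
            i += 1
        elif c.isspace():
            i += 1
        else:
            # Accumulate ordinary text up to the next delimiter.
            j = i
            while j < len(src) and src[j] not in '"~' and not src[j].isspace():
                j += 1
            tokens.append(("TEXT", src[i:j]))
            i = j
    return tokens


print(tokenize(r'\(~='))    # the tilde splits the escape sequence
print(tokenize(r'"\(~="'))  # a single QUOTED_TEXT token, escape intact
```

The second case is the one that matters here: the escape survives
lexing, but only as opaque text.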

Solving this would require changing the parsing behavior
to allow look-ahead for tilde.  I'm looking into it;
besides approx, some other reproducers
are \[~=] \[~~] \[=~].

On 2026-04-28 at 22:28-05:00, G. Branden Robinson wrote:
> I wonder if this issue has (nearly[1]) the same cause as an existing
> Savannah ticket.
>
> https://savannah.gnu.org/bugs/?66592
>
> Could you review that and confirm or refute?

IIUC the floor bug has a different cause,
and the patch I'm pinging here is unrelated to it as well:
https://lists.gnu.org/r/groff/2026-03/msg00113.html

On 2026-04-28 at 22:28-05:00, G. Branden Robinson wrote:
> At 2026-04-29T11:56:58+0900, Nguyễn Gia Phong wrote:
> > On 2026-03-26 at 13:48+09:00, Nguyễn Gia Phong wrote:
> > > XML only defines five entities (< > & ' ") out of the box,
> > > others need to be declared in the document's DOCTYPE.
> > > For web feeds such as RSS and Atom, it is particularly cumbersome
> > > to define the math entities as these feeds are supposed
> > > to be stand-alone and thus the entity definitions have to be inlined.
> > >
> > > Therefore, character references are now used
> > > instead of entity references, making the MathML output
> > > directly embeddable into these feeds.  The entity table
> > > is no longer used and thus removed.
> > >
> > > * src/preproc/eqn/text.cpp: Remove struct map, entity_table,
> > >   and special_to_entity.  Include "unicode.h" header file.
> > >   (special_char_box::output): Instead of named entity reference,
> > >   print XML character reference with Unicode codepoint for MathML.
> > >   Add support for Unicode code sequence as an input character.
> > >
> > > References: https://www.w3.org/TR/REC-xml/#sec-references
>
> if I understand your analysis
> correctly, you're probably on the right track: the special character
> rewriting table that GNU eqn uses for MathML mode is flat wrong, [so] copying
> or reusing the one for troff output.

I'm not claiming it is flat wrong or buggy, only that
its usefulness is limited: the generated MathML relies on
named entity references and therefore
*cannot* be embedded in web feeds.
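
To make the difference concrete, a small sketch in Python.  It is
not the patch itself (which is C++ in text.cpp); `char_ref` and
`parses` are names made up here, and U+2248 is just an example
code point.  A numeric character reference needs no DOCTYPE
declaration, so a stock XML parser accepts it, whereas a named
entity reference such as &approx; is rejected unless declared:

```python
import xml.etree.ElementTree as ET


def char_ref(codepoint):
    """Format a Unicode code point as a hexadecimal XML character reference."""
    return "&#x{:04X};".format(codepoint)


def parses(xml_text):
    """Return True if xml_text is well-formed without any DOCTYPE."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# U+2248 ALMOST EQUAL TO, chosen only as an illustration.
numeric = "<math><mo>" + char_ref(0x2248) + "</mo></math>"
named = "<math><mo>&approx;</mo></math>"

print(numeric, parses(numeric))
print(named, parses(named))
```

A feed reader with a conforming XML parser is in the same position
as `parses` here: it has no declaration for the named entity.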

I assume that very few people make use of the MathML output,
hence many problems remain undetected.

Cheers,
Phong
