Follow-up Comment #6, bug #67971 (group groff):

At 2026-01-31T12:31:13-0500, Deri James wrote:
> Follow-up Comment #5, bug #67971 (group groff):
>
> [comment #4 comment #4:]
>> At 2026-01-29T08:50:30-0500, Deri James wrote:
>>
>> I don't now remember, and when I install third-party fonts I always
>> rip them out again afterwards to keep my test environment "clean".
>>
>> So I'll have to stand up an experimental situation again.
>>
>> Or some brave volunteer could do so.  :)
>
> Given that Alexis reported problems using the new fonts with -Tpdf,
> I'd particularly like to know how you managed to get them to work
> before you committed them.

I recall duplicating TANAKA Takuji's results, on the "ps" device, using
_precisely_ the fonts he prescribed.

See bug #62830.

And then I uninstalled and discarded those fonts because I didn't want
them to confound any results I got while developing groff.  For similar
reasons, I avoid messing with the Tinos fonts that you've frequently
employed.

Would I like to have had an automated test for this?  Oh, hell, yes--but
we _can't_ perform automated testing of this functionality unless the
danged old fonts are installed (and configured for support: the
"Foundry"/"download" stuff).  That means a dependency.  And given the
grief the URW fonts give us already, and the rapidly ramifying
configuration combinations that result, should we have "basic",
"intermediate", and "full" support for CJK fonts as well?

For relevant background, see

https://www.usenix.org/legacy/publications/library/proceedings/sa92/spencer.pdf

So the three gropdf configurations requiring testing would become nine:

basic URW + basic CJK
basic URW + intermediate CJK
basic URW + full CJK
intermediate URW + basic CJK
intermediate URW + intermediate CJK
intermediate URW + full CJK
full URW + basic CJK
full URW + intermediate CJK
full URW + full CJK

...except maybe there'd be no such thing as "basic CJK" support since
the base 14 fonts of the PDF standard aren't specified to have any CJK
glyph coverage.  That cuts the number of configurations down to six.

basic URW + intermediate CJK
basic URW + full CJK
intermediate URW + intermediate CJK
intermediate URW + full CJK
full URW + intermediate CJK
full URW + full CJK

...and maybe "intermediate CJK" is a poor choice of term, because we
don't have any font metrics we can offer the user in the first place as
we can and do with the Latin script (based on the PostScript level 2
fonts from the Adobe foundry).

And as you point out below, even if we _had_ metrics, any given
character might be missing, so we have no business advertising glyph
availability.  The user should figure these matters out for themselves.

That would knock us back down to three configurations, which I guess is
the correct number.  (Except that it should be more difficult to select
only "basic" service, and there should be lots of build warnings when it
_is_ selected, to maximize user unease with their refusal to exercise
_all_ of gropdf's features.)
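
To make the counting above concrete, here is a rough sketch.  The level
names come from this message; the variable and function names are purely
illustrative, and nothing here corresponds to actual configure options
or build machinery.

```python
from itertools import product

# Support levels as named in this message (not real configure options).
LEVELS = ("basic", "intermediate", "full")


def combos(urw_levels, cjk_levels):
    """Cross product of URW and CJK support levels, as labels."""
    return [f"{u} URW + {c} CJK" for u, c in product(urw_levels, cjk_levels)]


# Every URW level paired with every CJK level.
nine = combos(LEVELS, LEVELS)

# Drop "basic CJK": the base 14 PDF fonts promise no CJK coverage.
six = combos(LEVELS, LEVELS[1:])

# Stop advertising CJK levels at all: only the URW axis remains.
three = [f"{u} URW" for u in LEVELS]

print(len(nine), len(six), len(three))  # prints "9 6 3"
```

Each step removes a whole axis value rather than individual pairs, which
is why the test matrix shrinks so quickly.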

> I was unable to get them to work with any devices except html
> and utf8,

That leaves just "ps" and "pdf".

> where even this works:-
>
> printf ".ft ZR\nハローワールド"| test-groff -Tutf8 -Kutf8 |less
>
> , but it could be just me using it wrong.

So, let me get this straight: you _don't_ get the following warning?


troff:<standard input>:1: warning: cannot select font 'ZR'


I don't understand the point you're trying to make.

How is your example above significantly different from this one?


$ printf ".ft BOGUS\nYou should not be able to read this because I selected a nonexistent font."| ./build/test-groff -Tutf8 -Kutf8 | cat -s
troff:<standard input>:1: warning: cannot select font 'BOGUS'
You should not be able to read this because I selected a nonexis‐
tent font.


The presumption in the sample text is false.

When the input selects a nonexistent font, no font change takes place.
If the currently selected font can resolve the glyphs requested for
formatting, you get them, and if it can't, you don't.
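
As a toy model of the rule just stated (this is emphatically not how
troff is implemented; the function names, the "mounted" table, and its
glyph sets are all invented for illustration):

```python
import string


def select_font(current, requested, mounted):
    """Return (font now in effect, warning message or None)."""
    if requested in mounted:
        return requested, None
    # Unknown font: warn and leave the current selection untouched.
    return current, f"warning: cannot select font '{requested}'"


def format_text(text, font, mounted):
    """Keep only the characters the font in effect can resolve."""
    available = mounted[font]
    return "".join(ch for ch in text if ch in available)


# Hypothetical device with a single mounted font covering ASCII letters.
mounted = {"R": set(string.ascii_letters + " .")}

font, warning = select_font("R", "BOGUS", mounted)
readable = format_text("You can read this.", font, mounted)

print(font)      # prints "R"
print(warning)   # prints "warning: cannot select font 'BOGUS'"
print(readable)  # prints "You can read this."
```

The failed selection changes nothing about which glyphs are available;
only the font already in effect decides that.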

> Which is why I think we need documentation on usage for each device.

I guess you place more value on various output drivers behaving
differently than I do.

Alternatively, we could rip the feature out.

I don't know what the font "ZR" _is_, but maybe that's your point--that
font selections don't mean the same sort of thing with nroff-mode output
drivers, which grotty certainly is, and grohtml arguably is.

But I'd say they do--see above.  What's lacking is support for
"families" (and type size changes, not relevant here).

> And the new "charset-range" command is telling porkies to troff,

Is that a problem with the feature _existing_, or a (potential or
actual) problem with the accuracy of font description files?

> since whatever "real" font is used to get the glyph definitions
> required for output is unlikely to have all the glyphs the
> charset-ranges claimed for the font, so rather than telling the user:-

Sounds like we should revert this feature.  Any given CJK font, even
when grouped in language-specific families as TANAKA Takuji's patch
did, might be missing glyph coverage even for the most common characters
in that language.

I'll take this opportunity to rebut your point that Times = Mincho and
Helvetica = Gothic.  While that may be (approximately) true for Japanese
and Korean, it is (apparently) not true for Chinese.  Font sinologists
use the terms "hei" and "song" instead.

> [derij@pip build (master)]$ printf ".ft Ryumin\n\[u31F0]"|test-groff -Tpdf -F ~/.groff/fonts -z
> troff:<standard input>:2: warning: special character 'u31F0' not defined
>
> You see:-
>
> [derij@pip build (master)]$ printf ".ft JPM\n\[u31F0]"|test-groff -Tpdf -F ~/.groff/fonts -z
>
> The reason is because the JPM groff font says:-
>
> #
> #  Japanese, Mincho style
> #  Adobe-Japan1
> #
>
> name JPM
> internalname Ryumin-Light-UniJIS-UTF16-H
> spacewidth 250
>
> charset-range
> [...]
> u31F0..u31FF    1000    0       --- Katakana Phonetic Extensions
>
> So groff thinks that u31F0 (ㇰ KATAKANA LETTER SMALL KU) is available
> to use even if the actual glyph for that character is not contained in
> the "real" font. This is new behaviour, previously troff would never
> emit 'Cu31F0' if the font you were using did not contain the glyph. I'm
> unsure what typesetting output drivers would do in this case, and
> since I can't get grops to work at all with these fonts, I can't
> investigate!
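
To illustrate the mismatch described in the quoted text, here is a
minimal sketch.  It assumes a range line has the shape shown in the JPM
excerpt above ("uXXXX..uYYYY" followed by metrics); the regular
expression handles only that shape, and the set of glyphs "really" in
the font is made up for the example.

```python
import re

# Matches only the "uXXXX..uYYYY <metrics>" shape quoted above.
RANGE_RE = re.compile(r"^u([0-9A-Fa-f]{4,6})\.\.u([0-9A-Fa-f]{4,6})\s")


def parse_range(line):
    """Return the code points claimed by one range line."""
    m = RANGE_RE.match(line)
    if m is None:
        raise ValueError(f"not a range line: {line!r}")
    first, last = int(m.group(1), 16), int(m.group(2), 16)
    return range(first, last + 1)


def claimed_but_missing(line, real_glyphs):
    """Code points the description claims but the real font lacks."""
    return [cp for cp in parse_range(line) if cp not in real_glyphs]


line = "u31F0..u31FF    1000    0       --- Katakana Phonetic Extensions"

# Hypothetical coverage: the real font lacks U+31F0 but has the rest.
real_glyphs = set(range(0x31F1, 0x3200))

missing = claimed_but_missing(line, real_glyphs)
print([f"u{cp:04X}" for cp in missing])  # prints "['u31F0']"
```

A check along these lines needs the real font's glyph list, which is
exactly the information the font description does not carry.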

Not quite seeing what the emergency is--you didn't make time at any
point from 28 July 2022 until the past week or so.

> Waiting for your instructions as to what I have to do to duplicate
> your testing.

I appreciate your frustration with getting things tested.  Perhaps you
begin to perceive the utility of having automated scripts to perform it.

I have a proposal.

How about you give me a list of everything you want me to revert, and
you can take over as release manager, for 1.24.0 at least.

Don't worry--I already know to start with commit
4201c8d10650c64823fdc423db4ff66174f9b0d6.

Or, crazy thought, we could label the CJK/UTF-16 feature as experimental
in the release notes, explicitly disclaim any plans for gropdf support
for it, and ask for help refining it from users of the other output
drivers it affects, since approximately no one tests any aspect of groff
behavior until a release is available anyway.

Then, once we've learned from actual users what's wrong with it, we can
improve it for the groff 1.25 release, whereupon another set of
users--not necessarily _completely_ distinct!--will emerge and claim
that they have a hard dependency on every aspect of groff 1.24.0's
CJK/UTF-16 font support, including the bugs.



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?67971>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
