Follow-up Comment #3, bug #68022 (group groff):

On Friday, 6 February 2026 13:28:57 GMT G. Branden Robinson wrote:
> Follow-up Comment #2, bug #68022 (group groff):
>
>>> Assigning to Deri for his feedback.
>>
>> I'm not to keen on this, although I may be persuaded. Let me explain
>> my logic:-
>>
>> Bits
>> ====
>>
>> 1. = subset
>
> I see Savannah "helpfully" rewrote your list enumerator here. :-|
>
> Un-corrupting it:
>> 0 = subset [embedded fonts]
>> 1 = compact by using space glyph (if available)
>> 2 = compress streams
>> 3 = remove font data (may create non-standard pdf)
>>
>> Setting value to 1 = do it, to zero = don't do it.
>>
>> The first 3 are things which you definitely want to do, the last is
>> definitely dodgy, so the default value is 7.
>>
>> Since bit 3 is "dangerous" I think it is "safer" to conciously add 8
>> to the value to cause it to happen.
>>
>> Please explain your logic similarly.
>
> Your presentation makes sense to me. I think of these as a set of
> "optimizations" (maybe that's what you were going for with `--opt`,
> rather than the "options" I inferred).

The problem is that they are not all "optimisations". Bit 3 certainly is
not: it will produce a less than optimal pdf. In terms of the size of the
pdf produced, though, I can see your thought process.

Perhaps it will help if I tell you why each facility is there.

Bit 0: The font subsetting code is new and complex, and I'd be a fool to
imagine it will work perfectly with every font in existence. So if a problem
is discovered with a particular font, I can ask the user to re-run with this
bit off to test whether the problem is related to the new code. If that
works, then at least the user has a workable solution, which takes the
pressure off me to find the actual problem in the code.

Bit 1: Switching to the "non-space" algorithm is in fact automatic on a
font-by-font basis. Gropdf used to output a message each time it switched,
but I believe you committed a change to output the message only if gropdf
was in debug mode, which I agreed with. Our EURO font has no space glyph, so
the message was appearing whenever the EURO font was active. This bit was
included so that I could force the non-space algorithm on fonts with a space
glyph, to ensure both algorithms produced visually identical results.

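To illustrate the difference between the two algorithms, here is a minimal
sketch in Python. It is not gropdf's code or its actual output: the function
name `show_text`, the `gap` value of 250 and the choice of the Tj and TJ
operators are all assumptions made for the example. The only fact relied on
is that, in a PDF TJ array, a number is in thousandths of a unit of text
space and a negative number moves the next string to the right.

```python
def show_text(words, has_space_glyph, gap=250):
    """Return a PDF text-showing operator for a list of words.

    Hypothetical helper: 'gap' is an assumed word gap in thousandths of
    a unit of text space. No escaping of '(', ')' or '\\' is attempted.
    """
    if has_space_glyph:
        # "Space" algorithm: one string, words separated by the font's
        # own space glyph.
        return "(" + " ".join(words) + ") Tj"
    # "Non-space" algorithm: each word is its own string, and a motion
    # (a negative adjustment in the TJ array) replaces the space glyph.
    parts = f" {-gap} ".join(f"({w})" for w in words)
    return "[" + parts + "] TJ"
```

Calling it with the same words and `has_space_glyph` set to True and then
False gives the two forms whose rendering one would want to compare.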
Bit 2: Optionally, any dictionary object in a pdf can be immediately
followed by a stream (stream...endstream); think of the stream as the
payload and the dictionary as the attributes associated with it. Streams are
normally flate compressed, which makes them unreadable with less or an
editor; turning this bit off stops the flate compression.

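As a rough illustration of a dictionary followed by its stream, with and
without flate compression, here is a sketch using Python's zlib module. The
function name `make_stream_object` is made up for the example, and the
object is simplified (no object number, no "obj"/"endobj" wrapper); it is
not how gropdf builds its output.

```python
import zlib


def make_stream_object(payload: bytes, compress: bool) -> bytes:
    """Return a stream dictionary followed by its stream body.

    Hypothetical helper: with compress=True the body is flate
    compressed and the dictionary says so via /Filter /FlateDecode.
    """
    if compress:
        data = zlib.compress(payload)
        header = b"<< /Length %d /Filter /FlateDecode >>" % len(data)
    else:
        data = payload
        header = b"<< /Length %d >>" % len(data)
    return header + b"\nstream\n" + data + b"\nendstream"
```

With `compress=False` the payload appears verbatim between "stream" and
"endstream", so it can be read with less or an editor; with `compress=True`
it has to be inflated first.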
Bit 3: Even with flate turned off, streams which hold font data are binary
and encoded, so they can't be "read", and they are often a significant
portion of the total size of the pdf. When debugging a pdf issue, it helps
to drop this binary font data to make traversal with an editor easier.

As you can see, these flags are technical tweaks to the pdf, not
optimisations.

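For anyone who prefers to see the arithmetic, the bit values discussed in
this thread (1, 2, 4, 8, default 7) can be decoded as in the sketch below.
The names `FLAGS`, `DEFAULT_MASK` and `describe` are invented for the
example; this is not gropdf's option-parsing code.

```python
# Value of each bit, with the meaning given earlier in this thread.
FLAGS = {
    1: "subset embedded fonts",            # bit 0
    2: "use space glyph (if available)",   # bit 1
    4: "compress streams",                 # bit 2
    8: "remove font data",                 # bit 3, the "dangerous" one
}

# The first three bits are on by default: 1 + 2 + 4 = 7.
DEFAULT_MASK = 1 | 2 | 4


def describe(mask):
    """Return the meanings of the bits that are set in mask."""
    return [meaning for value, meaning in FLAGS.items() if mask & value]
```

So `describe(7)` lists the first three features, `describe(6)` is the same
without subsetting, and bit 3 only appears once 8 is consciously added to
the value.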
> In fact, once I read your logic, I thought immediately of gzip
> compression, where bigger numbers (generally) mean tighter "crunching".
>
> Maybe you could ward off confusion among other readers by giving this
> option a full-word long name _and_ a one-letter alias?
>
> How about:
>
>
>     --optimizations=bit-mask
>     -O bit-mask
>            Specify output optimization features that generally reduce
>            the size of the generated PDF. bit-mask is a vector of
>            bits, each selecting an operation. Sum the following
>            values as desired.
>
>            Value   Meaning
>            -----------------------------------------------------
>            1       Subset embedded fonts, retaining only glyphs
>                    used in the document.
>            2       Make text more compact by using space glyphs
>                    instead of motions. Fonts that do not
>                    include a space glyph may conflict with this
>                    feature.
>                    [GBR: how? Will there be an error message?]

No, it's automatic on a font-by-font basis; see above.

As an example of a program getting it wrong (using spaces in a font lacking
the space glyph), see the attached png.

>            4       Compress all streams.
>                    [GBR: "Streams" is undefined in this man
>                    page. I suppose this is some kind of
>                    embedded PDF data structure. Can you offer
>                    one or two words more? What does ISO 32000
>                    call them?]
streams (see above)
>            8       Don't embed font files required by the
>                    document. A document employing any font
>                    other than the base 14 of the PDF standard is
>                    unlikely to render satisfactorily.
>
>            The default bit-mask is 7. To mimic what gropdf from groff
>            1.23 and earlier produced, specify "6", turning off
>            subsetting.
>
>
> Again, I'm happy to contribute code changes, and if you like my recast
> partial man page, that work's almost already done.
I still think --opt (-O) is better, given that they are not really
optimisations.
(file #58221)
_______________________________________________________
Additional Item Attachment:
Name: NoSpaceFont.png Size: 51KiB
<https://file.savannah.gnu.org/file/NoSpaceFont.png?file_id=58221>
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?68022>
