Hi,
it sounds easier than it is.
As Tilman already mentioned, the standard 14 Type 1 fonts are more or
less limited to Latin text. Anything else won't work.
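As a rough illustration of the "Latin only" limit, here is a minimal sketch that checks whether a piece of text could be written with a Latin-1 repertoire at all. It assumes ISO-8859-1 as a stand-in for what a standard 14 font can show; the real repertoire depends on the font and on the encoding used in the PDF, and the class and method names are made up for this example.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class LatinCoverageCheck {

    // Returns true if every character of the text can be encoded in ISO-8859-1.
    // Assumption: ISO-8859-1 is only an approximation of a standard 14 font's repertoire.
    static boolean fitsLatin1(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // German text with umlauts and sharp s: all inside Latin-1.
        System.out.println(fitsLatin1("Gr\u00fc\u00dfe aus K\u00f6ln"));
        // Cyrillic text: outside Latin-1, so a replacement is not an option here.
        System.out.println(fitsLatin1("\u041f\u0440\u0438\u0432\u0435\u0442"));
    }
}
```

A check like this only tells you when replacing is hopeless. A positive result does not mean the replacement will actually work.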
There are additional issues to solve:
- you have to deal with different well-known predefined encodings
- when it comes to complex fonts you have to deal with different
arbitrary font-specific encodings. Some of them don't provide a working
ToUnicode mapping, so you won't be able to map such encodings to
another encoding
- the replacement font has to offer the same characters as the
replaced one
- the width of a single character may differ between the source and
target font. In many cases the content stream relies on those values
when placing single characters, so the rendering may be different
These are just the obvious points. There may be others.
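To make the width issue a bit more concrete, here is a small sketch that sums the advance widths of a string for two fonts and prints the difference. The width tables are invented for illustration and are not taken from any real font; the class and method names are hypothetical as well. The only fact used is that simple-font glyph widths in a PDF are given in 1/1000 of the font size.

```java
import java.util.Map;

final class WidthDrift {

    // Sums the advance widths (in 1/1000 of the font size) of all characters in the text.
    // Throws if the width table has no entry for a character, i.e. the glyph is missing.
    static int advance(String text, Map<Character, Integer> widths) {
        int total = 0;
        for (char c : text.toCharArray()) {
            Integer w = widths.get(c);
            if (w == null) {
                throw new IllegalArgumentException("no glyph for '" + c + "'");
            }
            total += w;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical width tables, NOT taken from any real font.
        Map<Character, Integer> source = Map.of('a', 500, 'b', 520, 'c', 480);
        Map<Character, Integer> target = Map.of('a', 556, 'b', 556, 'c', 500);

        int drift = advance("abc", target) - advance("abc", source);
        double fontSize = 12.0;
        // Convert from 1/1000 units to points at the given font size.
        System.out.println("drift at " + fontSize + " pt: " + (drift * fontSize / 1000.0) + " pt");
    }
}
```

If the content stream positions every character or word explicitly, a drift like this shows up as uneven gaps or overlapping glyphs after the font has been swapped.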
You can deal with some of them with more or less effort, but others
are practically impossible to solve without recreating the whole content
stream, including some OCR to replace a missing ToUnicode mapping.
You mentioned LibreOffice, but I'm afraid that is another use case. They
are replacing the fonts within a Word document. IMHO that is much
easier to achieve.
To sum it up, PDFBox doesn't provide any easy way to do what you are
looking for and there won't be any in the future, unless some genius
finds a way and provides a patch to be included ;-)
Andreas
On 16.11.24 at 19:41, r.barc...@habmalnefrage.de.INVALID wrote:
Tilman,
> However the standard 14 type 1 helvetica isn't embedded anyway so maybe
> this is moot.
Exactly, I want the standard type 1 helvetica which isn't embedded, instead of
a proprietary font.
Please see my attached sketch.
For my app, I have two choices:
1. Detect fonts like Arial and reject the template PDF: "Sorry, your PDF cannot be
used as a template because it contains the proprietary font %s!"
2. "Oh, contains Arial! But: No problem! I know that font and I'll replace that for
you by the default Helvetica which doesn't have to be embedded."
For unknown fonts other than Arial or Times New Roman, I'd have to go with
option (1) anyway.
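The decision between the two options could be sketched roughly like this. It covers only the name check, not the replacement itself. The mapping table and all class and method names are hypothetical; the only PDF-specific assumption is that subset fonts carry a tag of six uppercase letters and a plus sign in front of the PostScript name.

```java
import java.util.Map;
import java.util.Optional;

final class FontPolicy {

    // Hypothetical mapping from known proprietary PostScript names to standard 14 names.
    private static final Map<String, String> SUBSTITUTES = Map.of(
            "ArialMT", "Helvetica",
            "Arial-BoldMT", "Helvetica-Bold",
            "TimesNewRomanPSMT", "Times-Roman");

    // Strips a subset tag such as "ABCDEF+" from the front of a font name.
    static String baseName(String fontName) {
        return fontName.replaceFirst("^[A-Z]{6}\\+", "");
    }

    // Option 2: returns the standard 14 substitute if the font is a known one.
    static Optional<String> substituteFor(String fontName) {
        return Optional.ofNullable(SUBSTITUTES.get(baseName(fontName)));
    }

    // Option 1 as fallback: builds the rejection message for unknown fonts.
    static String decide(String fontName) {
        return substituteFor(fontName)
                .map(s -> "replace " + baseName(fontName) + " by " + s)
                .orElseGet(() -> String.format(
                        "Sorry, your PDF cannot be used as a template because it "
                                + "contains the proprietary font %s!", baseName(fontName)));
    }

    public static void main(String[] args) {
        System.out.println(decide("ABCDEF+ArialMT"));
        System.out.println(decide("SomeUnknownFont"));
    }
}
```

In a real application the font names would come from the page resources of the template PDF rather than from string literals.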
LibreOffice has a feature that points in the same direction:
https://blog.documentfoundation.org/blog/2020/09/08/libreoffice-tt-replacing-microsoft-fonts/
Yours,
Reg
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org