Re: Parsing TTF in memory in FontBox 3.x

Daniel Gredler Tue, 10 Feb 2026 06:52:47 -0800

> we could get UnitsPerEm because we read that table in
> HeaderTable.readHeaders()
> Do you need the complete "head" or "hhea" table?


Just parts of the hhea table (ascender, descender, line gap). I'm not sure
it's a good idea to start down this path, though, as there will always be a
few more fields that will be useful to someone :-)

Instead of an on-demand mode, what do you think of adding the option of
table-level granularity to the parse request, e.g. instead of:

parser.parse(buffer)

you could call

parser.parse(buffer, NamingTable.TAG,
HeaderTable.TAG, OS2WindowsMetricsTable.TAG, HorizontalHeaderTable.TAG)

to select the specific tables that you need ahead of time.
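
To make the idea concrete, here is a rough sketch in plain Java (not FontBox code, and not a proposal for the actual implementation) of what table-level selection amounts to at the file-format level: walk the sfnt table directory, keep only the requested tags, and read a few fixed-offset fields from "head" and "hhea". The class and method names (SelectiveTableReader, selectTables, and the accessors) are hypothetical. The byte offsets follow the OpenType table layouts, and the sketch assumes a single-font sfnt file (not a .ttc) with table offsets that fit in a signed int.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of FontBox.
final class SelectiveTableReader {

    // Walks the sfnt table directory and returns a slice for each requested tag.
    static Map<String, ByteBuffer> selectTables(ByteBuffer font, Set<String> wanted) {
        ByteBuffer buf = font.duplicate().order(ByteOrder.BIG_ENDIAN);
        int numTables = buf.getShort(4) & 0xFFFF;   // uint16 at byte 4 of the header
        Map<String, ByteBuffer> tables = new HashMap<>();
        for (int i = 0; i < numTables; i++) {
            int rec = 12 + 16 * i;                  // 12-byte header, 16-byte records
            byte[] tagBytes = {
                buf.get(rec), buf.get(rec + 1), buf.get(rec + 2), buf.get(rec + 3)
            };
            String tag = new String(tagBytes, StandardCharsets.US_ASCII);
            if (!wanted.contains(tag)) {
                continue;                           // skip tables nobody asked for
            }
            int offset = buf.getInt(rec + 8);       // uint32, assumed to fit in an int
            int length = buf.getInt(rec + 12);
            ByteBuffer view = buf.duplicate();
            view.limit(offset + length);
            view.position(offset);
            tables.put(tag, view.slice().order(ByteOrder.BIG_ENDIAN));
        }
        return tables;
    }

    // "head" table: unitsPerEm is a uint16 at byte offset 18.
    static int unitsPerEm(ByteBuffer head) {
        return head.getShort(18) & 0xFFFF;
    }

    // "hhea" table: ascender, descender and lineGap are int16 at offsets 4, 6 and 8.
    static short ascender(ByteBuffer hhea) {
        return hhea.getShort(4);
    }

    static short descender(ByteBuffer hhea) {
        return hhea.getShort(6);
    }

    static short lineGap(ByteBuffer hhea) {
        return hhea.getShort(8);
    }
}
```

A caller would pass the in-memory font plus the set of tags it cares about, e.g. "head" and "hhea", and then read only the fields it needs from the returned slices. Tables that were not requested are never touched.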

If FileSystemFontProvider.scanFonts() is very performance-sensitive, it
would probably need to continue using parseTableHeaders... but if there's a
little wiggle room, it might be able to use this table-level granularity as
well. Right now it's not only selecting specific tables to read, but also
the specific fields in each table that it wants to read (so even higher
granularity than just table-level).

Take care,

Daniel


On Tue, Feb 10, 2026 at 1:29 PM Tilman Hausherr <[email protected]>
wrote:

> On 10.02.2026 at 12:55, Daniel Gredler wrote:
> > Hi Tilman,
> >
> > It looks like I may need to go back to 2.x for now. The on-demand table
> > loading removed in PDFBOX-5460 (
> > https://issues.apache.org/jira/browse/PDFBOX-5460) was important for me,
> > and the header-only mode added in PDFBOX-5847 (
> > https://issues.apache.org/jira/browse/PDFBOX-5847), which might have
> > served as a sort of replacement, doesn't include all of the information
> > I'm currently reading (e.g. units per em or horizontal header info),
> > since it's very focused on just supporting the needs of
> > FileSystemFontProvider.scanFonts(). Is a more generic on-demand load mode
> > completely off the table for 3.0?
>
> Re your last question, I'd prefer not to touch that part due to the
> problems we had when it existed. But we could get UnitsPerEm because we
> read that table in HeaderTable.readHeaders(). What do you mean with
> "horizontal header info"? Do you need the complete "head" or "hhea"
> table? That would be too much. It might be easier for you to fork that
> code and delete everything you don't need and just get the values you
> want. And then use the official fontbox for the actual work.
>
> Tilman
>
>
>
> >
> > Take care,
> >
> > Daniel
> >
> >
> >
> > On Mon, Feb 9, 2026 at 10:15 PM Daniel Gredler<[email protected]>
> > wrote:
> >
> >> Actually, it looks like a JIRA entry won't be necessary. Switching to
> >> OTFParser seems to do the trick for this file.
> >>
> >> Further, I think the difference in behavior was due to the disappearance
> >> of the "parseOnDemand" option, which I was using (when on-demand is
> >> enabled in FontBox 2.x, this CFF validation doesn't run).
> >>
> >> Thanks again for the pointers!
> >>
> >> Daniel
> >>
> >>
> >>
> >> On Mon, Feb 9, 2026 at 9:13 PM Daniel Gredler<[email protected]>
> >> wrote:
> >>
> >>> That's odd. It has a ".otf" file extension, and FontBox 2.x didn't seem
> >>> to have any issues with it.
> >>>
> >>> I'll create a JIRA issue with more information, since it sounds like
> >>> feature parity is expected here (and I should be able to attach the
> >>> offending file).
> >>>
> >>> Take care,
> >>>
> >>> Daniel
> >>>
> >>>
> >>> On Mon, Feb 9, 2026 at 8:57 PM Tilman Hausherr<[email protected]>
> >>> wrote:
> >>>
> >>>> On 09.02.2026 at 20:43, Daniel Gredler wrote:
> >>>>> Ah, got it -- thanks! This indeed fixed the EOF issue.
> >>>>>
> >>>>> However, I'm now getting the following error: "True Type fonts using
> >>>>> CFF outlines are not supported"
> >>>>>
> >>>>> This is while reading the Noto Sans CJK Regular font file. Is this an
> >>>>> area where FontBox 3.x functionality is more limited than FontBox 2.x
> >>>>> was?
> >>>> No, you should get the same exception in 2.0. If you don't have it as
> >>>> a ttf file, it should work as a ttc file, however the calls are
> >>>> different
> >>>>
> >>>> https://github.com/notofonts/noto-cjk/tree/main/Sans/OTC
> >>>>
> >>>> https://github.com/notofonts/noto-cjk/tree/main/Sans
> >>>>
> >>>> from the EmbeddedMultipleFonts.java example
> >>>>
> >>>> TrueTypeCollection ttc2 = new TrueTypeCollection(
> >>>>         new File("c:/windows/fonts/batang.ttc"));
> >>>>
> >>>> PDType0Font font2 = PDType0Font.load(document,
> >>>>         ttc2.getFontByName("Batang"), true); // Korean
> >>>>
> >>>>
> >>>> TrueTypeCollection can also take an InputStream.
> >>>>
> >>>> However I see from that website that ttf files are also available, if
> >>>> you know which country you need.
> >>>>
> >>>>
> >>>> Tilman
> >>>>
> >>>>
> >>>>
> >>>>
>
