[ https://issues.apache.org/jira/browse/PDFBOX-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164376#comment-17164376 ]
Tilman Hausherr commented on PDFBOX-4909:
-----------------------------------------
I prefer my own proposed change because it alters less of the original code.
I don't really like CapturingSetFontAndSize because it adds some complexity,
while a map is easier to understand. But I'd love to hear arguments for or
against either approach.
I ran some benchmarks, but they are not very reliable. TestTextStripper takes
between 6 and 7 seconds on my machine, and there is no clear pattern showing
that one solution is faster than the other.
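
For readers unfamiliar with the map-based idea, here is a minimal sketch of
what "a map is easier to understand" could look like. It is not the attached
patch and does not use PDFBox classes: FontHeightCache, heightOf and
computations are hypothetical names, and the font key type F stands in for
whatever object identifies a font in the real code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

/**
 * Hypothetical helper (not part of PDFBox): remembers one height per font so
 * the expensive computation runs once per font instead of once per glyph.
 */
final class FontHeightCache<F> {
    private final Map<F, Float> heights = new HashMap<>();
    private final ToDoubleFunction<F> expensiveHeight;
    private int computations; // how often the expensive path actually ran

    FontHeightCache(ToDoubleFunction<F> expensiveHeight) {
        this.expensiveHeight = expensiveHeight;
    }

    /** Returns the cached height, computing it only on the first request. */
    float heightOf(F font) {
        Float cached = heights.get(font);
        if (cached == null) {
            cached = (float) expensiveHeight.applyAsDouble(font);
            computations++;
            heights.put(font, cached);
        }
        return cached;
    }

    int computations() {
        return computations;
    }

    public static void main(String[] args) {
        // Stand-in for the real height computation; fonts are plain names here.
        FontHeightCache<String> cache =
                new FontHeightCache<>(name -> name.length() * 1.5);
        // Three glyphs, two distinct fonts.
        for (String font : new String[] {"F1", "F1", "F2"}) {
            cache.heightOf(font);
        }
        System.out.println(cache.computations()); // prints 2
    }
}
```

In this sketch the lookup happens at the point where a glyph is shown, so no
operator needs to be replaced. A real implementation would also have to decide
how long cached entries may live, which the sketch ignores.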
> Don't calculate font height for every glyph
> -------------------------------------------
>
> Key: PDFBOX-4909
> URL: https://issues.apache.org/jira/browse/PDFBOX-4909
> Project: PDFBox
> Issue Type: Improvement
> Components: Text extraction
> Affects Versions: 2.0.0, 3.0.0 PDFBox
> Reporter: Alfred
> Assignee: Tilman Hausherr
> Priority: Major
> Labels: Optimization
> Fix For: 2.0.21, 3.0.0 PDFBox
>
> Attachments: PDFBOX-4909.patch, Untitled.png,
> WithCapturingSetFontAndSize.png
>
>
> LegacyPDFStreamEngine computes the font height for every glyph, and the
> computation is rather heavy because it has to work around all known problems.
> Instead of computing it for every glyph, we can recompute only when the font
> changes. The SetFontAndSize operator is invoked when the font changes, so
> we can use it to compute and store the height, so that it is ready when
> needed in showGlyph.
>