[
https://issues.apache.org/jira/browse/PDFBOX-4875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134714#comment-17134714
]
Tilman Hausherr commented on PDFBOX-4875:
-----------------------------------------
SonarCloud
[complains|https://sonarcloud.io/project/issues?id=pdfbox-reactor&issues=AXKtDSmGtAjNMTTDoX69&open=AXKtDSmGtAjNMTTDoX69]
about the synchronization: it doesn't like locking on strings because they're
pooled, so unrelated code could end up locking on the same instance.
However, I think there is another problem: the code synchronizes on a different
object for each font name, but every thread accesses the shared FONTS. So FONTS
is not protected against concurrent access by threads using different font names.
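One possible way to address both points, as a minimal sketch only: keep the lazily loaded fonts in a ConcurrentHashMap and load through computeIfAbsent, so that nothing locks on a String and the shared map itself is safe for concurrent access. The class LazyFontCache and its loader function are hypothetical names for illustration; this is not the actual Standard14Fonts code or the attached patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical lazy, thread-safe cache keyed by font name.
 * An illustration of the idea, not the PDFBox implementation.
 */
class LazyFontCache<F> {
    // Shared state: ConcurrentHashMap tolerates concurrent reads and writes,
    // including by threads that ask for different font names.
    private final Map<String, F> fonts = new ConcurrentHashMap<>();

    // Assumed loader: turns a font name into a loaded font object.
    private final Function<String, F> loader;

    LazyFontCache(Function<String, F> loader) {
        this.loader = loader;
    }

    F get(String fontName) {
        // computeIfAbsent is atomic per key: the loader is applied at most
        // once for a given name, and no String is used as a lock object.
        return fonts.computeIfAbsent(fontName, loader);
    }
}
```

With this shape, a call such as cache.get("Helvetica") loads the font on first use and returns the cached instance afterwards. The loader must not modify the same map while it runs, which is a documented restriction of computeIfAbsent.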
> Lazy load standard 14 fonts, only if needed
> -------------------------------------------
>
> Key: PDFBOX-4875
> URL: https://issues.apache.org/jira/browse/PDFBOX-4875
> Project: PDFBox
> Issue Type: Improvement
> Components: Parsing, Text extraction
> Affects Versions: 2.0.20, 3.0.0 PDFBox
> Reporter: Alfred
> Priority: Major
> Labels: Optimization
> Fix For: 2.0.21, 3.0.0 PDFBox
>
> Attachments: PDFBOX-4875.patch
>
> Original Estimate: 1m
> Remaining Estimate: 1m
>
> I am testing text extraction from PDFs and profiling the execution.
> I found that the second biggest time consumer is the static code in
> Standard14Fonts that loads fonts from the PDFBox jar.
> Looking at the code, I realized we don't have to load all fonts statically
> when the class loads.
> Not all PDFs need all fonts, so if we lazily loaded them only when needed, it
> would save some time and some memory.
> The memory part in particular would be important when running on a tablet or
> a phone, where the entire memory space of the app is 80M - 160M.