[ https://issues.apache.org/jira/browse/PDFBOX-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Doswald updated PDFBOX-3418:
------------------------------------
Attachment: PDFBOX-3418_ToUnicodeWriter_performance_rev1.patch
PDFBOX-3418_PerformanceTest.zip
Added a proposed patch to speed up ToUnicodeWriter. A JMH benchmark project
is also included.
My test results show that the changes have just a small impact on desktop
computers, but a significant impact on my embedded system (i.MX6 DL):
Desktop:
OLD: PdfBoxBenchmark.loadEmbeddedFont avgt 10 76.644 ± 1.295 ms/op
NEW: PdfBoxBenchmark.loadEmbeddedFont avgt 10 60.510 ± 1.265 ms/op
Embedded:
OLD: PdfBoxBenchmark.loadEmbeddedFont avgt 10 1075.366 ± 32.550 ms/op
NEW: PdfBoxBenchmark.loadEmbeddedFont avgt 10 665.002 ± 31.051 ms/op
Also, the allocation rate has decreased significantly. Below are the
measurements on my desktop system.
OLD:
PdfBoxBenchmark.loadEmbeddedFont:·gc.alloc.rate avgt 10 542.965 ± 9.165 MB/sec
PdfBoxBenchmark.loadEmbeddedFont:·gc.alloc.rate.norm avgt 10 43351852.752 ± 437.214 B/op
NEW:
PdfBoxBenchmark.loadEmbeddedFont:·gc.alloc.rate avgt 10 273.424 ± 9.800 MB/sec
PdfBoxBenchmark.loadEmbeddedFont:·gc.alloc.rate.norm avgt 10 17341439.557 ± 474.988 B/op
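To put the figures above in relative terms, the reduction can be computed as
(old - new) / old. The following is only a small sketch of that arithmetic;
the class name BenchmarkDelta and the method reductionPercent are made up for
illustration and are not part of the attached benchmark project. The input
values are the mean scores reported above, and the error margins are ignored.

```java
public final class BenchmarkDelta {

    // Relative reduction from oldValue to newValue, in percent.
    public static double reductionPercent(double oldValue, double newValue) {
        return (oldValue - newValue) / oldValue * 100.0;
    }

    public static void main(String[] args) {
        // Mean scores copied from the JMH output in this message.
        System.out.printf("desktop time:      %.1f%%%n",
                reductionPercent(76.644, 60.510));
        System.out.printf("embedded time:     %.1f%%%n",
                reductionPercent(1075.366, 665.002));
        System.out.printf("desktop alloc/op:  %.1f%%%n",
                reductionPercent(43351852.752, 17341439.557));
    }
}
```

With these inputs the run time drops by roughly a fifth on the desktop and by
well over a third on the embedded board, while the bytes allocated per
operation drop by about 60 percent.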
> Slow string to hex conversion in ToUnicodeWriter
> ------------------------------------------------
>
> Key: PDFBOX-3418
> URL: https://issues.apache.org/jira/browse/PDFBOX-3418
> Project: PDFBox
> Issue Type: Improvement
> Components: PDModel
> Affects Versions: 2.0.2
> Environment: Ubuntu 14.04 LTS
> Reporter: Michael Doswald
> Priority: Trivial
> Labels: optimization, performance
> Attachments: PDFBOX-3418_PerformanceTest.zip,
> PDFBOX-3418_ToUnicodeWriter_performance_rev1.patch
>
>
> The ToUnicodeWriter.writeTo(OutputStream) method converts a lot of shorts and
> strings to hexadecimal strings. This is done with String.format, which is
> not very efficient.
> The ToUnicodeWriter.toHex(int) and ToUnicodeWriter.stringToHex(String)
> methods could be rewritten to fill a char array directly, which would be
> generally more efficient.
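As an illustration of the char-array idea described in the issue, here is a
minimal sketch. It is not the attached patch. It assumes that each value is
written as four uppercase hex digits (the equivalent of
String.format("%04X", ...) for 16-bit values) and that each char of a string
is encoded the same way. The class name HexSketch and the helper writeHex are
hypothetical. Unlike String.format, this version keeps only the low 16 bits
of the input.

```java
public final class HexSketch {

    // Lookup table: index 0..15 maps to the matching uppercase hex digit.
    private static final char[] HEX_DIGITS = "0123456789ABCDEF".toCharArray();

    // Formats the low 16 bits of value as exactly four hex digits.
    public static String toHex(int value) {
        char[] out = new char[4];
        writeHex(value, out, 0);
        return new String(out);
    }

    // Encodes every char of text as four hex digits in one preallocated array,
    // avoiding a format call and a temporary String per character.
    public static String stringToHex(String text) {
        char[] out = new char[text.length() * 4];
        for (int i = 0; i < text.length(); i++) {
            writeHex(text.charAt(i), out, i * 4);
        }
        return new String(out);
    }

    // Writes four hex digits for the low 16 bits of value, starting at offset.
    private static void writeHex(int value, char[] out, int offset) {
        out[offset]     = HEX_DIGITS[(value >>> 12) & 0xF];
        out[offset + 1] = HEX_DIGITS[(value >>> 8) & 0xF];
        out[offset + 2] = HEX_DIGITS[(value >>> 4) & 0xF];
        out[offset + 3] = HEX_DIGITS[value & 0xF];
    }

    public static void main(String[] args) {
        System.out.println(toHex(0x41));        // 0041
        System.out.println(stringToHex("AB"));  // 00410042
    }
}
```

The table lookup replaces format-string parsing with four shifts and masks
per value, and sizing the output array up front avoids intermediate strings,
which is the kind of change that would be expected to lower the allocation
rate.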