[
https://issues.apache.org/jira/browse/PDFBOX-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Axel Howind updated PDFBOX-5997:
--------------------------------
Attachment:
avoid_the_creation_of_temporary_string_instances_when_parsing_hex_values_version1.patch
> avoid creation of temporary objects when parsing hex values
> -----------------------------------------------------------
>
> Key: PDFBOX-5997
> URL: https://issues.apache.org/jira/browse/PDFBOX-5997
> Project: PDFBox
> Issue Type: Improvement
> Reporter: Axel Howind
> Priority: Major
> Attachments:
> avoid_the_creation_of_temporary_string_instances_when_parsing_hex_values_version1.patch
>
>
> There are currently two places where hex numbers are parsed in PDFBox: the
> Hex and COSString classes. The current implementation instantiates several
> temporary objects for each conversion:
> 1. trim() is called on the String, creating a copy if the String is not yet
> trimmed.
> 2. a StringBuilder is created containing the String and possibly a padding 0.
> This has to copy the whole character array every time.
> 3. for each pair of hex digits, substring() is called, creating a new String
> instance (or looking it up in the String pool).
> I have created two different patches for this. One also replaces the
> Integer.parseInt() call, and one uses an overload of that method. Both
> should be considerably faster and reduce GC activity. You might want to run
> a benchmark to decide which one to use.
> Version 1 also does not rely on exception handling, which is inherently slow,
> to handle incorrect hex data. Version 2 still uses exception handling, but
> should nevertheless improve performance and reduce GC activity.
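
For illustration only, the allocation-free approach described above could look roughly like the sketch below. This is not the attached patch: the class name HexSketch and the methods hexDigit and decodeHex are hypothetical, and returning null for invalid input is an assumed convention chosen to show error handling without exceptions. Padding an odd digit count with a trailing 0 is also an assumption, based on the "padding 0" mentioned in the description.

```java
// Hypothetical sketch, not the code from the attached patch.
final class HexSketch {
    private HexSketch() {
    }

    /** Returns the value (0..15) of a hex digit, or -1 if c is not a hex digit. */
    static int hexDigit(char c) {
        if (c >= '0' && c <= '9') {
            return c - '0';
        }
        if (c >= 'a' && c <= 'f') {
            return c - 'a' + 10;
        }
        if (c >= 'A' && c <= 'F') {
            return c - 'A' + 10;
        }
        return -1;
    }

    /**
     * Decodes hex text into bytes without trim(), StringBuilder or substring().
     * Returns null (assumed convention) if a non-hex character is found.
     */
    static byte[] decodeHex(CharSequence hex) {
        // Skip leading and trailing whitespace by moving indices instead of
        // calling trim(), which may copy the String.
        int start = 0;
        int end = hex.length();
        while (start < end && hex.charAt(start) <= ' ') {
            start++;
        }
        while (end > start && hex.charAt(end - 1) <= ' ') {
            end--;
        }

        int digits = end - start;
        byte[] out = new byte[(digits + 1) / 2];
        for (int i = 0; i < out.length; i++) {
            int pos = start + 2 * i;
            int hi = hexDigit(hex.charAt(pos));
            // Assumption: a missing final digit is treated as 0, so no
            // padded copy of the input is needed.
            int lo = pos + 1 < end ? hexDigit(hex.charAt(pos + 1)) : 0;
            if (hi < 0 || lo < 0) {
                return null; // invalid digit, reported without an exception
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

The only allocation per call here is the result array. The overload-based variant mentioned in the description may refer to Integer.parseInt(CharSequence, int, int, int), available since Java 9, which parses a sub-range without substring() but still throws NumberFormatException on bad input. Whether either variant is measurably faster would need the benchmark suggested above.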
--
This message was sent by Atlassian Jira
(v8.20.10#820010)