[ 
https://issues.apache.org/jira/browse/PDFBOX-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17947767#comment-17947767
 ] 

Tilman Hausherr commented on PDFBOX-5997:
-----------------------------------------

For now I applied only a one-line change in PDFBOX-5996, because we're close 
to a release and I'm too busy right now to take a closer look. Maybe next week.

> avoid creation of temporary objects when parsing hex values
> -----------------------------------------------------------
>
>                 Key: PDFBOX-5997
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5997
>             Project: PDFBox
>          Issue Type: Improvement
>            Reporter: Axel Howind
>            Priority: Major
>         Attachments: 
> avoid_creation_of_temporary_objects_when_parsing_hex_strings_version2.patch, 
> avoid_the_creation_of_temporary_string_instances_when_parsing_hex_values_version1.patch
>
>
> There are currently two places where hex numbers are parsed in PDFBox: the 
> Hex and COSString classes. The current implementation instantiates several 
> temporary objects for each conversion:
> 1. trim() is called on the String, creating a copy if the String is not 
> already trimmed.
> 2. A StringBuilder is created containing the String and possibly a padding 
> 0. This copies the whole character array every time.
> 3. For each pair of hex digits, substring() is called, creating a new String 
> instance (or looking it up in the String pool).
>
> I have created two different patches for this: one that also replaces the 
> Integer.parseInt() call, and one that uses an overload of the method. Both 
> should be considerably faster and reduce GC activity. You might want to run 
> a benchmark to decide which one to use.
> Version 1 also does not rely on exception handling, which is inherently 
> slow, to handle incorrect hex data. Version 2 still uses exception handling, 
> but should nevertheless improve performance and reduce GC activity.
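
For readers of the archive, the idea behind the first patch variant can be 
sketched roughly as follows. This is not the code from either attached patch; 
the class name HexSketch, the method name parseHex, and the choice to return 
null for invalid input are assumptions made only for illustration. The sketch 
skips surrounding whitespace by index instead of calling trim(), pads an odd 
digit count with a trailing 0 (the PDF convention for hex strings) without a 
StringBuilder, and decodes each digit with Character.digit() instead of 
substring() plus Integer.parseInt().

```java
// Hypothetical illustration, not the attached patch.
public final class HexSketch {
    private HexSketch() {
    }

    /**
     * Decodes a hex string into bytes without creating temporary Strings.
     * Returns null (an assumption of this sketch) if a non-hex character
     * is found, so no exception handling is needed for bad input.
     */
    public static byte[] parseHex(String hex) {
        int start = 0;
        int end = hex.length();
        // Skip surrounding whitespace by moving indices instead of trim().
        while (start < end && hex.charAt(start) <= ' ') {
            start++;
        }
        while (end > start && hex.charAt(end - 1) <= ' ') {
            end--;
        }

        int digits = end - start;
        byte[] out = new byte[(digits + 1) / 2];
        for (int i = 0; i < out.length; i++) {
            int pos = start + 2 * i;
            int hi = Character.digit(hex.charAt(pos), 16);
            // A missing final digit is treated as 0 (trailing padding).
            int lo = pos + 1 < end ? Character.digit(hex.charAt(pos + 1), 16) : 0;
            if (hi < 0 || lo < 0) {
                return null; // invalid hex digit
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

The only allocation per call is the result array. Note that Character.digit() 
also accepts non-ASCII Unicode digits, so a stricter parser would probably use 
its own lookup table for 0-9, a-f and A-F.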


