[ https://issues.apache.org/jira/browse/PDFBOX-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Axel Howind updated PDFBOX-5997:
--------------------------------
    Flags: Patch

> avoid creation of temporary objects when parsing hex values
> -----------------------------------------------------------
>
>                 Key: PDFBOX-5997
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5997
>             Project: PDFBox
>          Issue Type: Improvement
>            Reporter: Axel Howind
>            Priority: Major
>         Attachments: avoid_creation_of_temporary_objects_when_parsing_hex_strings_version2.patch,
>                      avoid_the_creation_of_temporary_string_instances_when_parsing_hex_values_version1.patch
>
>
> There are currently two places where hex numbers are parsed in PDFBox: the
> Hex and COSString classes. The current implementation instantiates several
> temporary objects for each conversion:
> 1. trim() is called on the String, creating a copy if the String is not
> already trimmed.
> 2. A StringBuilder is created containing the String and possibly a padding 0.
> This has to copy the whole character array every time.
> 3. For each pair of hex digits, substring() is called, creating a new String
> instance (or looking it up in the String pool).
> I have created two different patches for this: one that also replaces the
> Integer.parseInt() call and one that uses an overload of the method. Both
> should be considerably faster and reduce GC activity. You might want to run
> a benchmark to decide which one to use.
> Version 1 also does not rely on exception handling, which is inherently slow,
> to handle incorrect hex data. Version 2 still uses exception handling, but
> should nevertheless improve performance and reduce GC activity.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
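
For readers without the attachments at hand, the allocation-free idea described in the issue can be sketched roughly as follows. This is not the content of either attached patch. The class name HexSketch, the method names, and the choice to return null for invalid input are all hypothetical. The sketch also assumes that an odd number of digits is padded with a trailing 0, as the PDF specification prescribes for hex strings. It skips surrounding whitespace by index instead of calling trim(), reads digits straight from the String instead of building a StringBuilder or calling substring(), and reports bad input without throwing an exception.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not a PDFBox class.
final class HexSketch {

    // Returns the value of one hex digit, or -1 if c is not a hex digit.
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Decodes a hex string into bytes; returns null if a non-hex character is found.
    // The only allocation is the result array.
    static byte[] decodeHex(String s) {
        // Mirror trim() by moving indices instead of copying the String.
        int start = 0;
        int end = s.length();
        while (start < end && s.charAt(start) <= ' ') start++;
        while (end > start && s.charAt(end - 1) <= ' ') end--;

        int digits = end - start;
        byte[] out = new byte[(digits + 1) / 2];
        for (int i = 0; i < out.length; i++) {
            int pos = start + 2 * i;
            int hi = hexValue(s.charAt(pos));
            // Assumption: a missing final digit is treated as 0.
            int lo = (pos + 1 < end) ? hexValue(s.charAt(pos + 1)) : 0;
            if (hi < 0 || lo < 0) {
                return null; // invalid digit, signalled without an exception
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = decodeHex(" 48656C6C6F ");
        System.out.println(new String(bytes, StandardCharsets.US_ASCII));
    }
}
```

Whether a hand-written digit lookup like this or the Integer.parseInt() overload that takes a CharSequence with start and end indices (available since Java 9) is faster would need to be measured, as the issue itself suggests.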