[ https://issues.apache.org/jira/browse/PDFBOX-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Doswald updated PDFBOX-3432:
------------------------------------
    Attachment: patch_for_CustomMap_VS_GSCollections_benchmark.patch
                fontbox-benchmark-CustomMap-VS-GSCollections.zip

I've created a patch and a benchmark to compare the GS-Collections int map with the custom int map I wrote. The GS-Collections code had to be stripped down quite a bit. I've squashed the class hierarchy of IntIntHashMap and inlined some static functions.

To my surprise, it seems that in this use-case, the GS-Collections are slower than the custom IntIntMap. It could be because of the different ways one has to iterate over the map entries (GS-Collections creates an IntIntPair object for each mapping).

The performance comparison numbers:

Desktop
PdfBoxBenchmark.loadTTFFontCustomIntIntMap  avgt   1438.574 ±  37.288  us/op
PdfBoxBenchmark.loadTTFFontGSIntIntMap      avgt   1569.250 ±  34.920  us/op

Embedded
PdfBoxBenchmark.loadTTFFontCustomIntIntMap  avgt  28274.989 ± 742.245  us/op
PdfBoxBenchmark.loadTTFFontGSIntIntMap      avgt  36742.291 ± 919.297  us/op

Maybe I've made a mistake in the benchmark?

> Optimize CID to GlyphId mapping (TTF)
> -------------------------------------
>
>                 Key: PDFBOX-3432
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3432
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: FontBox
>    Affects Versions: 2.0.1, 2.0.2, 2.0.3
>         Environment: Ubuntu 14.04.4 LTS
>            Reporter: Michael Doswald
>            Priority: Trivial
>              Labels: optimization, performance
>             Fix For: 2.0.3, 2.1.0
>
>         Attachments: PDFBOX-3432_Optimize_CID_to_GlyphId_mapping_rev1.patch,
>                      fontbox-benchmark-CustomMap-VS-GSCollections.zip,
>                      patch_for_CustomMap_VS_GSCollections_benchmark.patch,
>                      pdfbox-performance-PDFBOX-3432.zip
>
>
> TTF fonts map code-points (Code IDs) to glyphs. These are mappings from int
> to int. Because the JDK lacks map classes for primitive types, the code (e.g.
> in CmapSubtable) currently uses Map<Integer,Integer> for those mappings.
> This is inefficient in different ways:
> * Autoboxing/unboxing introduces a performance penalty
> * Boxing to Integer objects has a memory overhead
> * The JDK Map implementation has a big memory overhead for such simple objects
> For efficiency (execution time and memory consumption) I would propose to
> introduce a simple IntIntMap implementation which works with primitive
> integers.
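
For readers without the attachments at hand, the following is a rough sketch of what a primitive int-to-int map along these lines could look like. It is not the code from the attached patches: the class name IntIntMapSketch, the IntIntConsumer callback, the open-addressing layout with linear probing, and the use of -1 as the "no mapping" value (on the assumption that glyph IDs are never negative) are all illustrative choices. Iteration goes through a callback that receives two primitives, so no per-entry pair object is allocated, which is the difference suspected above as a possible cause of the slower GS-Collections numbers.

```java
/** Hypothetical sketch of an int-to-int hash map; not taken from the attached patch. */
final class IntIntMapSketch {

    /** Receives key and value as primitives, so iterating allocates no pair objects. */
    interface IntIntConsumer {
        void accept(int key, int value);
    }

    /** Returned by get() for missing keys. Assumption: glyph IDs are never negative. */
    static final int NO_VALUE = -1;

    private int[] keys;
    private int[] values;
    private boolean[] used;
    private int size;

    IntIntMapSketch(int expectedSize) {
        if (expectedSize < 0 || expectedSize > (1 << 28)) {
            throw new IllegalArgumentException("unsupported size: " + expectedSize);
        }
        int capacity = 16;
        while (capacity < expectedSize * 2) {
            capacity <<= 1; // power of two, load factor kept at or below 0.5
        }
        allocate(capacity);
    }

    private void allocate(int capacity) {
        keys = new int[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
        size = 0;
    }

    int size() {
        return size;
    }

    void put(int key, int value) {
        if ((size + 1) * 2 > keys.length) {
            resize();
        }
        int i = indexOf(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    int get(int key) {
        int i = indexOf(key);
        return used[i] ? values[i] : NO_VALUE;
    }

    void forEach(IntIntConsumer action) {
        for (int i = 0; i < keys.length; i++) {
            if (used[i]) {
                action.accept(keys[i], values[i]);
            }
        }
    }

    /** Linear probing: returns the slot holding the key, or the first free slot. */
    private int indexOf(int key) {
        int mask = keys.length - 1;
        int h = key * 0x9E3779B9; // cheap multiplicative scramble of the key bits
        int i = (h ^ (h >>> 16)) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    private void resize() {
        int[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        allocate(oldKeys.length * 2);
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                put(oldKeys[i], oldValues[i]);
            }
        }
    }
}
```

Compared with Map<Integer,Integer>, a layout like this stores each mapping in two int slots plus one boolean, with no Integer or entry objects. Whether it beats a given library map in practice still has to be measured, as the benchmark figures above suggest.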