[ 
https://issues.apache.org/jira/browse/PDFBOX-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Doswald updated PDFBOX-3432:
------------------------------------
    Attachment: patch_for_CustomMap_VS_GSCollections_benchmark.patch
                fontbox-benchmark-CustomMap-VS-GSCollections.zip

I've created a patch and a benchmark to compare the GS-Collections int map with 
the custom int map I wrote. The GS-Collections code had to be stripped down 
quite a bit. I've squashed the class hierarchy of IntIntHashMap and inlined 
some static functions.

To my surprise, GS-Collections seems to be slower than the custom IntIntMap in 
this use case. This could be due to the different ways one has to iterate over 
the map entries (GS-Collections creates an IntIntPair object for each 
mapping).
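
To illustrate the suspected difference, here is a minimal sketch of the two 
iteration styles. All names (IntIntVisitor, PairSketch, IterationSketch) are 
hypothetical and are not taken from the patch or from GS-Collections; the 
table is a toy without hashing, since only the iteration matters here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback type; not part of PDFBox or GS-Collections. */
interface IntIntVisitor {
    void visit(int key, int value);
}

/** Hypothetical pair holder, standing in for a per-entry pair object. */
final class PairSketch {
    final int key;
    final int value;

    PairSketch(int key, int value) {
        this.key = key;
        this.value = value;
    }
}

/** Toy fixed-capacity int-to-int table that only contrasts the two iteration styles. */
final class IterationSketch {
    private final int[] keys;
    private final int[] values;
    private int size;

    IterationSketch(int capacity) {
        keys = new int[capacity];
        values = new int[capacity];
    }

    /** Appends a mapping; no hashing or duplicate handling (assumption: callers stay within capacity). */
    void add(int key, int value) {
        keys[size] = key;
        values[size] = value;
        size++;
    }

    /** Style 1: primitives are handed to a callback, so no object is created per entry. */
    void forEach(IntIntVisitor visitor) {
        for (int i = 0; i < size; i++) {
            visitor.visit(keys[i], values[i]);
        }
    }

    /** Style 2: one pair object is allocated for every entry before the caller sees it. */
    List<PairSketch> pairs() {
        List<PairSketch> result = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            result.add(new PairSketch(keys[i], values[i]));
        }
        return result;
    }
}
```

With style 1 the caller never sees an allocated entry object; with style 2 
every mapping costs one allocation, which may matter more on the embedded 
target than on the desktop.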

The performance comparison numbers (average time per operation, lower is 
better):

Desktop
Benchmark                                   Mode      Score ±   Error  Units
PdfBoxBenchmark.loadTTFFontCustomIntIntMap  avgt   1438.574 ±  37.288  us/op
PdfBoxBenchmark.loadTTFFontGSIntIntMap      avgt   1569.250 ±  34.920  us/op

Embedded
Benchmark                                   Mode      Score ±   Error  Units
PdfBoxBenchmark.loadTTFFontCustomIntIntMap  avgt  28274.989 ± 742.245  us/op
PdfBoxBenchmark.loadTTFFontGSIntIntMap      avgt  36742.291 ± 919.297  us/op

Maybe I've made a mistake in the benchmark?


> Optimize CID to GlyphId mapping (TTF)
> -------------------------------------
>
>                 Key: PDFBOX-3432
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3432
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: FontBox
>    Affects Versions: 2.0.1, 2.0.2, 2.0.3
>         Environment: Ubuntu 14.04.4 LTS
>            Reporter: Michael Doswald
>            Priority: Trivial
>              Labels: optimization, performance
>             Fix For: 2.0.3, 2.1.0
>
>         Attachments: PDFBOX-3432_Optimize_CID_to_GlyphId_mapping_rev1.patch, 
> fontbox-benchmark-CustomMap-VS-GSCollections.zip, 
> patch_for_CustomMap_VS_GSCollections_benchmark.patch, 
> pdfbox-performance-PDFBOX-3432.zip
>
>
> TTF fonts map code points (Code IDs) to glyphs. These are mappings from int 
> to int. Because the JDK lacks map classes for primitive types, the code (e.g. 
> in CmapSubtable) currently uses Map<Integer,Integer> for those mappings. This 
> is inefficient in several ways:
> * Autoboxing/unboxing introduces a performance penalty
> * Boxing to Integer objects has a memory overhead
> * The JDK Map implementation has a big memory overhead for such simple objects
> For efficiency (execution time and memory consumption), I propose 
> introducing a simple IntIntMap implementation that works with primitive 
> integers.
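
A minimal sketch of what such a primitive map could look like, assuming open 
addressing with linear probing and a caller-supplied default for missing keys. 
The class name IntIntMapSketch and all of its methods are hypothetical; this 
is not the IntIntMap from the attached patch.

```java
/** Hypothetical int-to-int hash map on primitive arrays; not the class from the patch. */
final class IntIntMapSketch {
    private int[] keys;
    private int[] values;
    private boolean[] used;   // marks occupied slots, so any int (including 0) can be a key
    private int size;

    IntIntMapSketch() {
        allocate(16);
    }

    private void allocate(int capacity) {
        keys = new int[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
        size = 0;
    }

    /** Capacity is always a power of two, so masking replaces the modulo. */
    private int slot(int key) {
        int h = key * 0x9E3779B9;   // multiplicative hash to spread clustered keys
        return (h ^ (h >>> 16)) & (keys.length - 1);
    }

    /** Inserts or overwrites a mapping without boxing either int. */
    void put(int key, int value) {
        if ((size + 1) * 2 > keys.length) {
            grow();   // keep the load factor at or below 0.5
        }
        int i = slot(key);
        while (used[i] && keys[i] != key) {
            i = (i + 1) & (keys.length - 1);   // linear probing
        }
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    /** Returns the mapped value, or defaultValue if the key is absent. */
    int get(int key, int defaultValue) {
        int i = slot(key);
        while (used[i]) {
            if (keys[i] == key) {
                return values[i];
            }
            i = (i + 1) & (keys.length - 1);
        }
        return defaultValue;
    }

    int size() {
        return size;
    }

    /** Doubles the capacity and re-inserts all existing mappings. */
    private void grow() {
        int[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        allocate(oldKeys.length * 2);
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                put(oldKeys[i], oldValues[i]);
            }
        }
    }
}
```

Compared with Map<Integer,Integer>, this layout stores each mapping in three 
array slots and creates no Integer or map entry objects, which addresses the 
three points listed in the description.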



