[ https://issues.apache.org/jira/browse/PDFBOX-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264607#comment-16264607 ]
Tilman Hausherr commented on PDFBOX-4022:
-----------------------------------------
Please set the first property mentioned here:
https://pdfbox.apache.org/2.0/getting-started.html
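
For reference, a minimal sketch of setting such a property from code rather than on the command line. It assumes that the first property on that page is the Java 8 color management switch `sun.java2d.cmm`, which selects the older KCMS module; check the page for the exact name and value. The class name `CmmSetup` is hypothetical.

```java
// Sketch only. Assumption: the property meant above is "sun.java2d.cmm",
// with the KCMS provider as its value. CmmSetup is a hypothetical helper.
final class CmmSetup {
    static final String CMM_KEY = "sun.java2d.cmm";
    static final String KCMS_PROVIDER = "sun.java2d.cmm.kcms.KcmsServiceProvider";

    private CmmSetup() {
    }

    // Call this before any rendering code runs, so the setting is in
    // place when the color management classes are first loaded.
    static void useKcms() {
        System.setProperty(CMM_KEY, KCMS_PROVIDER);
    }
}
```

The same setting can be passed at JVM startup as `-Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider`, in which case no code change is needed.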
The file is slow because it has 678 tiny images with an ICC colorspace. It
renders in 25 seconds for me; with your patch, it renders in 7 seconds.
So caching does help. However, we can't use your patch because its cache is
never cleared and would keep growing: it would hold on to the COSBase
colorSpace even after the file is closed. We currently cache colorspaces, but
only those that are directly in the resources, not those in an image. I'll
add that.
So although your patch wasn't used, your analysis helped. Thank you for it.
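
To illustrate the difference between a cache that lives forever and one tied to the document, here is a rough sketch of a document-scoped cache. It is not PDFBox code: `DocumentScopedCache` and its methods are hypothetical names, and the sketch is not thread-safe. The idea is only that entries are keyed by object identity and released when the owning document is closed.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a per-document cache; not part of PDFBox.
final class DocumentScopedCache<K, V> implements AutoCloseable {
    // Identity-based keys: two lookups hit the same entry only when they
    // pass the very same key object (e.g. the same COS object instance).
    private final Map<K, V> entries = new IdentityHashMap<>();

    // Returns the cached value for key, creating it once via factory.
    V getOrCreate(K key, Function<K, V> factory) {
        V value = entries.get(key);
        if (value == null) {
            value = factory.apply(key);
            entries.put(key, value);
        }
        return value;
    }

    int size() {
        return entries.size();
    }

    // Called when the owning document is closed, so nothing outlives it.
    @Override
    public void close() {
        entries.clear();
    }
}
```

A static map in a class such as PDColorSpace would have no equivalent of `close()`, which is why it would keep growing across documents.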
> Cache ColorSpace instances in PDColorSpace.java
> -----------------------------------------------
>
> Key: PDFBOX-4022
> URL: https://issues.apache.org/jira/browse/PDFBOX-4022
> Project: PDFBox
> Issue Type: Improvement
> Components: Rendering
> Affects Versions: 2.0.7, 2.0.8
> Environment: Windows 10
> Reporter: savan patel
> Assignee: Tilman Hausherr
> Labels: patch
> Attachments: PDColorSpace.java.patch, selection.pdf
>
>
> I have a PDF that contains a lot of images. Each time PDFBox parses an
> image, it uses the colorspace to do so, and most of the time the colorspace
> is the same object. I tried caching the colorspace instances in
> PDColorSpace.java, as shown in the attached patch. For this particular PDF,
> the change reduces the time to parse and render the images from
> 15 minutes to 3 minutes. I have shared the PDF.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)