[ 
https://issues.apache.org/jira/browse/PDFBOX-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610415#comment-16610415
 ] 

Timo Boehme commented on PDFBOX-4309:
-------------------------------------

I've found the reason why the 'direct-draw' solution is slower than mine and 
is also much slower on other pages of the problematic document (e.g. 9 seconds 
vs. 0.2 seconds): in PDICCBased.loadICCProfile() some operations are performed 
only to trigger exceptions, so that we can fall back to the alternate color 
space. One of these triggers, awtColorSpace.toRGB(), causes (in my environment) 
a 0.4 second delay; internally it seems to use the same slow color-convert 
operation.
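
To reproduce the delay outside of PDFBox, something like the following could 
be used. It is only a rough sketch: the class and method names are made up for 
illustration, and the JDK's built-in sRGB profile stands in for the embedded 
ICC profile of the problematic document, so the absolute numbers will differ 
from the 0.4 seconds mentioned above.

```java
import java.awt.color.ColorSpace;
import java.awt.color.ICC_ColorSpace;
import java.awt.color.ICC_Profile;

// Hypothetical helper, not part of PDFBox.
public class ToRgbTiming {

    /** Times a single toRGB() call on a freshly created ICC_ColorSpace. */
    static long timeFirstToRgbNanos(ICC_Profile profile) {
        ICC_ColorSpace colorSpace = new ICC_ColorSpace(profile);
        // All-zero components are a valid input for the profiles used here.
        float[] components = new float[colorSpace.getNumComponents()];
        long start = System.nanoTime();
        float[] rgb = colorSpace.toRGB(components);
        long elapsed = System.nanoTime() - start;
        if (rgb.length != 3) {
            throw new IllegalStateException("expected 3 RGB components");
        }
        return elapsed;
    }

    /** Assumption: the built-in sRGB profile is used as a stand-in profile. */
    static long timeBuiltInProfileNanos() {
        return timeFirstToRgbNanos(ICC_Profile.getInstance(ColorSpace.CS_sRGB));
    }

    public static void main(String[] args) {
        // Each call creates a new color space instance, as happens per image.
        System.out.printf("first instance:  %.1f ms%n", timeBuiltInProfileNanos() / 1e6);
        System.out.printf("second instance: %.1f ms%n", timeBuiltInProfileNanos() / 1e6);
    }
}
```

Running it with the ICC profile extracted from an affected document instead of 
the built-in one should show whether toRGB() alone accounts for the delay.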

I wanted to check whether an alternative operation without this side effect 
could be used, but I found no document that triggers the exception (in my 
environment). The code references the following problematic documents:
 * PDFBOX-1295: triggers an exception, but via the 'ComponentColorModel' 
trigger, not the 'toRGB' one
 * PDFBOX-1740: same as PDFBOX-1295
 * PDFBOX-3610: no exception

Thus it's not clear to me whether the 'toRGB' trigger is still needed. At the 
least I would like to have a switch to disable this trigger, with the trigger 
'on' by default for compatibility. For PDFBox version 3.x we could maybe 
remove it, in case we don't find any documents the trigger is good for.
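
Such a switch could look roughly like the following. This is only a sketch of 
the idea, not the patch itself: the property name and the class are 
hypothetical, and the Runnable stands in for the awtColorSpace.toRGB() call.

```java
// Hypothetical sketch, not part of PDFBox.
public class ToRgbTriggerSwitch {

    // Hypothetical property name chosen for this sketch; not an existing PDFBox setting.
    static final String PROPERTY = "example.pdfbox.icc.toRgbTrigger";

    /** The trigger is 'on' unless the property is set to something other than "true". */
    static boolean isTriggerEnabled() {
        return Boolean.parseBoolean(System.getProperty(PROPERTY, "true"));
    }

    /** Runs the (possibly slow) trigger only when enabled; returns whether it ran. */
    static boolean runTriggerIfEnabled(Runnable trigger) {
        if (!isTriggerEnabled()) {
            return false;
        }
        trigger.run();
        return true;
    }

    public static void main(String[] args) {
        boolean ran = runTriggerIfEnabled(() -> System.out.println("toRGB trigger executed"));
        System.out.println("trigger ran: " + ran);
    }
}
```

With this default nothing changes for existing users; starting the JVM with 
-Dexample.pdfbox.icc.toRgbTrigger=false would skip the trigger.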

> Performance regression in PDColorSpace#toRGBImageAWT Part 2
> -----------------------------------------------------------
>
>                 Key: PDFBOX-4309
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4309
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Rendering
>    Affects Versions: 2.0.11, 3.0.0 PDFBox
>            Reporter: Timo Boehme
>            Assignee: Timo Boehme
>            Priority: Minor
>              Labels: optimization
>         Attachments: PDColorSpace.java.patch, PDICCBased.java.patch
>
>
> This is a continuation of PDFBOX-3569. In a (private) PDF document there are 
> graphics produced by CorelDraw which are composed of more than 2500(!) 
> images, each with its own indexed color space based on an ICC color space 
> (the shadows of graphic objects are created by a large number of gray lines 
> ...). In our environment (OpenJDK 7 and OpenJDK 8, IcedTea, Suse Linux 64Bit) 
> rendering a single page with one graphic takes 780 seconds. Most of the time 
> is spent creating the indexed color space via ICC color space mapping:
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>         at sun.java2d.cmm.lcms.LCMS.createNativeTransform(Native Method)
>         at sun.java2d.cmm.lcms.LCMS.createTransform(LCMS.java:156)
>         at sun.java2d.cmm.lcms.LCMSTransform.doTransform(LCMSTransform.java:155)
>         - locked <0x0000000723af9e30> (a sun.java2d.cmm.lcms.LCMSTransform)
>         at sun.java2d.cmm.lcms.LCMSTransform.colorConvert(LCMSTransform.java:268)
>         at java.awt.image.ColorConvertOp.ICCBIFilter(ColorConvertOp.java:355)
>         at java.awt.image.ColorConvertOp.filter(ColorConvertOp.java:282)
>         at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.toRGBImageAWT(PDColorSpace.java:314)
>         at org.apache.pdfbox.pdmodel.graphics.color.PDICCBased.toRGBImage(PDICCBased.java:276)
>         at org.apache.pdfbox.pdmodel.graphics.color.PDIndexed.initRgbColorTable(PDIndexed.java:141)
>         at org.apache.pdfbox.pdmodel.graphics.color.PDIndexed.<init>(PDIndexed.java:91)
>         at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:184)
>         at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:70)
>         at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.createFromCOSObject(PDColorSpace.java:240)
>         at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:92)
>         at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:70)
>         at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getColorSpace(PDImageXObject.java:672)
>         at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.getRGBImage(SampledImageReader.java:196)
>         at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getImage(PDImageXObject.java:443)
>         at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getImage(PDImageXObject.java:424)
>         at org.apache.pdfbox.rendering.PageDrawer.drawImage(PageDrawer.java:1046){noformat}
> Calling LittleCMS (LCMS) several thousand times is the problem here, taking 
> way too much time. Unfortunately, using KCMS via 
> {{-Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider}} is not an 
> option either, as the Suse IcedTea OpenJDK seems not to include it 
> (anymore?), in both Java 7 and Java 8.
> However the ICC color space (PDICCBased) returns in this case CMYK as 
> alternate color space and for CMYK we have the alternative rendering via 
> system property org.apache.pdfbox.rendering.UsePureJavaCMYKConversion from 
> PDFBOX-3569.
> The idea is now to have an option to force using the alternative color space 
> instead of the ICC one to circumvent using LCMS in toRGBImage(). For CMYK as 
> alternative color space it has to be combined with the system property 
> 'UsePureJavaCMYKConversion'.
> Using this approach the rendering time of the page with the problematic 
> graphic drops from 780 seconds to 1 second!
> It is clear that using the alternate color space might return wrong or 
> inexact colors. Therefore this mode should only be enabled as an option. 
> However, for processing large collections of PDF documents (e.g. focusing on 
> text) or for displaying a PDF in a timely manner, the performance improvement 
> should outweigh the drop in image quality.
> While the provided patch will, if activated, use the alternate color space in 
> every case, it would be possible at a later stage to add more intelligent 
> logic which decides, based on a runtime analysis, when to use this mode 
> (number of calls to LCMS, time needed etc.).
> If there are no objections to this patch I will apply it in the next few days.
>  


