[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707121#comment-16707121
 ] 

Itai Shaked commented on PDFBOX-4392:
-------------------------------------

I started looking at the warning message and at the potential performance 
impact of copying and re-parsing the entire ICC profile when it is Perceptual, 
and I came across something peculiar that may be a bug. 

On line 220 of PDICCBased.java, the following test is used to check whether 
the profile is perceptual:

{{if (profileData[ICC_Profile.icHdrRenderingIntent] == ICC_Profile.icPerceptual)}}

Here icHdrRenderingIntent has the value 64 and icPerceptual is 0. 

The ICC format specification 
(http://www.color.org/specification/ICC1v43_2010-12.pdf) has a table on page 
19 describing the format of the header, in which the Rendering Intent field 
does indeed occupy bytes 64-67 of the header. On page 23, however, where 
Rendering Intent is described, it says: "The field is a uInt32Number in which 
the least-significant 16 bits shall be used to encode the rendering intent. 
The most significant 16 bits shall be set to zero (0000h)." Since the entire 
format is big-endian, I read this as meaning that the most significant bytes 
come first, so byte 64 is always zero and the value to check is actually at 
index 67 (profileData[ICC_Profile.icHdrRenderingIntent + 3]). The current test 
will therefore always return true, regardless of the rendering intent.  
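
To illustrate the point, here is a minimal sketch. It is not the PDFBox code: 
the class RenderingIntentCheck and its two methods are hypothetical names made 
up for this example, and the 128-byte array merely stands in for a real profile 
header. It assumes only the ICC_Profile constants from java.awt.color and the 
header layout quoted above.

```java
import java.awt.color.ICC_Profile;

public class RenderingIntentCheck {

    /**
     * Reads the rendering intent from the big-endian uInt32Number stored in
     * header bytes 64-67. Only the least-significant 16 bits (bytes 66 and 67)
     * carry the intent.
     */
    static int readRenderingIntent(byte[] profileData) {
        int offset = ICC_Profile.icHdrRenderingIntent; // 64
        return ((profileData[offset + 2] & 0xFF) << 8)
                | (profileData[offset + 3] & 0xFF);
    }

    /** The test as quoted above: it looks only at byte 64, the most significant byte. */
    static boolean isPerceptualOriginalTest(byte[] profileData) {
        return profileData[ICC_Profile.icHdrRenderingIntent] == ICC_Profile.icPerceptual;
    }

    public static void main(String[] args) {
        // Stand-in for a profile header whose intent is relative colorimetric (1).
        byte[] header = new byte[128];
        header[ICC_Profile.icHdrRenderingIntent + 3] = (byte) ICC_Profile.icRelativeColorimetric;

        // Byte 64 is zero, so the quoted test reports "perceptual".
        System.out.println(isPerceptualOriginalTest(header)); // prints "true"

        // Reading the low 16 bits gives 1, which is not icPerceptual (0).
        System.out.println(readRenderingIntent(header) == ICC_Profile.icPerceptual); // prints "false"
    }
}
```

If my reading of the specification is right, a fix could be as small as 
comparing the byte at index 67 instead, although reading both low-order bytes 
as above seems closer to the wording of the spec.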

If this is indeed the case, PDFBox may be changing profiles when it should 
not, which could hurt both performance and accuracy.  

Or am I misunderstanding the specification? 

> PDF completely blow up the RAM on amazon instances
> --------------------------------------------------
>
>                 Key: PDFBOX-4392
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4392
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.12
>            Reporter: Oleksandr Skoryi
>            Priority: Major
>             Fix For: 2.0.13
>
>         Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf
>
>
> Hi all
> The issue is pretty straightforward. I receive a lot of PDFs every day and 
> render them. In most cases everything is OK, but PDFs that produce 
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is 
> Perceptual, ignoring, treating as Display class
> take a very long time to render and consume a lot of memory. 
> Rendering takes from 5 to 15 min on an m5.large Amazon instance, but the 
> attached PDF completely killed the instance: the Java process is simply 
> killed by Linux during processing, with no exception in the logs. 
> So could you please explain what is going on with files that produce the 
> WARN message above, and how I can improve the rendering?
>  
> Here are my VM options: 
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G 
> -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
> Also, don't hesitate to ask me for more PDFs, I have tons of them :D
>  
> One more question: does the GPU have any influence on rendering?


