[
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707121#comment-16707121
]
Itai Shaked commented on PDFBOX-4392:
-------------------------------------
I started looking at the warning message, and the potential performance impact
of copying and re-parsing the entire ICC profile in case it is Perceptual, and
I came across something peculiar, which may be a bug.
In line 220 of the file PDICCBased.java the following test is used to check
whether the profile is perceptual:
{{ if (profileData[ICC_Profile.icHdrRenderingIntent] == ICC_Profile.icPerceptual) }}
where icHdrRenderingIntent has the value 64 and icPerceptual is 0.
The ICC format specification
(http://www.color.org/specification/ICC1v43_2010-12.pdf) has a table on page
19 describing the format of the header, in which the Rendering Intent field is
indeed in bytes 64-67 of the header. On page 23, however, where Rendering Intent
is described, it says "The field is a uInt32Number in which the
least-significant 16 bits shall be used to encode the rendering intent. The
most significant 16 bits shall be set to zero (0000h)." Since the entire
format is big-endian, this to me means the value to be checked should actually
be at index 67 (profileData[ICC_Profile.icHdrRenderingIntent + 3]), and the
current test will always return true, regardless of the rendering intent,
because byte 64 is part of the most significant 16 bits, which are always zero
in a conforming profile.
If this is indeed the case, PDFBox may be wrongly changing profiles, which
may impact both performance and accuracy.
Or am I misunderstanding the specification?
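To make the byte-order argument concrete, here is a minimal sketch comparing
the current test with a read of the low 16 bits of the big-endian field. The
class and method names (RenderingIntentCheck, renderingIntent,
isPerceptualCurrentTest) are hypothetical and not part of PDFBox, and the
header is a synthetic all-zero array rather than a real profile; only the
ICC_Profile constants come from the JDK.

```java
import java.awt.color.ICC_Profile;

// Hypothetical illustration only; not PDFBox code.
class RenderingIntentCheck {

    // Reads the rendering intent from the least-significant 16 bits of the
    // big-endian uInt32Number stored in header bytes 64-67.
    public static int renderingIntent(byte[] profileData) {
        int off = ICC_Profile.icHdrRenderingIntent; // 64
        return ((profileData[off + 2] & 0xFF) << 8) | (profileData[off + 3] & 0xFF);
    }

    // The test quoted above: it only looks at byte 64, the most significant byte.
    public static boolean isPerceptualCurrentTest(byte[] profileData) {
        return profileData[ICC_Profile.icHdrRenderingIntent] == ICC_Profile.icPerceptual;
    }

    public static void main(String[] args) {
        // Synthetic 128-byte header with the intent set to relative colorimetric (1).
        byte[] header = new byte[128];
        header[ICC_Profile.icHdrRenderingIntent + 3] = (byte) ICC_Profile.icRelativeColorimetric;

        System.out.println(isPerceptualCurrentTest(header));                     // true
        System.out.println(renderingIntent(header) == ICC_Profile.icPerceptual); // false
    }
}
```

If my reading of the specification is right, the first line shows the current
test reporting "perceptual" for a profile whose intent is actually relative
colorimetric, while the second shows the 16-bit read giving the expected answer.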
> PDF completely blow up the RAM on amazon instances
> --------------------------------------------------
>
> Key: PDFBOX-4392
> URL: https://issues.apache.org/jira/browse/PDFBOX-4392
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 2.0.12
> Reporter: Oleksandr Skoryi
> Priority: Major
> Fix For: 2.0.13
>
> Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf
>
>
> Hi all
> The issue is pretty straightforward. I receive a lot of PDFs every day and
> render them. In most cases everything is OK, but PDFs which produce
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is
> Perceptual, ignoring, treating as Display class
> take a very long time to render and consume a lot of memory.
> It takes from 5 to 15 min on an m5.large Amazon instance. But the attached PDF
> completely killed the instance. The Java process is just killed by Linux
> during processing, with no exception in the logs.
> So could you please explain what is going on with files that produce the
> WARN message above, and how I can improve the rendering?
>
> Here are my VM options:
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G
> -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
> Also, don't hesitate to ask me for more PDFs, I have tons of them :D
>
> And also a question: does the GPU have any influence on rendering?