[ 
https://issues.apache.org/jira/browse/PDFBOX-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549209#comment-15549209
 ] 

Tilman Hausherr commented on PDFBOX-3523:
-----------------------------------------

tldr: it's the interpolation.

Thank you for answering, and sorry I asked so much, but I thought it was a 
newbie error (it isn't). Your code is fine (although you might get slightly 
better results with the ImageUtils methods). The first PDF (which you deleted) 
renders with 2 colors in 1.8, and with 256 unique colors in 2.0.3. This is 
due to the high quality interpolation. It is even more extreme in the second 
file, which is in color. You can check the number of colors with IrfanView by 
pressing "i".
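
The effect can be reproduced outside of PDFBox. The following is only a 
minimal sketch using plain java.awt, not PDFBox code: the class name 
InterpolationColorCount, the checkerboard test image and the scale factor are 
made up for illustration. It enlarges a 2-color image once with nearest 
neighbour and once with bilinear interpolation and counts the unique colors 
in each result.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.AffineTransform;
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of PDFBox.
class InterpolationColorCount {

    // Builds a black-and-white checkerboard, i.e. an image with exactly 2 colors.
    static BufferedImage checkerboard(int size) {
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                img.setRGB(x, y, ((x + y) % 2 == 0) ? 0xFFFFFF : 0x000000);
            }
        }
        return img;
    }

    // Draws src enlarged by an integer factor, using the given interpolation hint.
    static BufferedImage scaleUp(BufferedImage src, int factor, Object interpolationHint) {
        BufferedImage dst = new BufferedImage(src.getWidth() * factor,
                src.getHeight() * factor, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, interpolationHint);
        g.drawImage(src, AffineTransform.getScaleInstance(factor, factor), null);
        g.dispose();
        return dst;
    }

    // Counts the distinct pixel values in the image.
    static int countUniqueColors(BufferedImage img) {
        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                colors.add(img.getRGB(x, y));
            }
        }
        return colors.size();
    }

    public static void main(String[] args) {
        BufferedImage src = checkerboard(8);
        BufferedImage nearest = scaleUp(src, 4,
                RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR);
        BufferedImage bilinear = scaleUp(src, 4,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        System.out.println("nearest neighbour: " + countUniqueColors(nearest));
        System.out.println("bilinear:          " + countUniqueColors(bilinear));
    }
}
```

Nearest neighbour only copies source pixels, so the color count stays at 2, 
while smooth interpolation blends neighbouring pixels and introduces 
intermediate grays. More colors usually also means a larger compressed file.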

The cause is this code:
{code}
    public void drawImage(PDImage pdImage) throws IOException
    {
        Matrix ctm = getGraphicsState().getCurrentTransformationMatrix();
        AffineTransform at = ctm.createAffineTransform();

        if (!pdImage.getInterpolate())
        {
            boolean isScaledUp = pdImage.getWidth() < Math.round(at.getScaleX()) ||
                                 pdImage.getHeight() < Math.round(at.getScaleY());

            // if the image is scaled down, we use smooth interpolation, eg PDFBOX-2364
            // only when scaled up do we use nearest neighbour, eg PDFBOX-2302 / mori-cvpr01.pdf
            // stencils are excluded from this rule (see survey.pdf)
            if (isScaledUp || pdImage.isStencil())
            {
                graphics.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                        RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR);
            }
        }

        if (pdImage.isStencil())
        {
            // fill the image with paint
            //TODO why no soft mask?
            BufferedImage image = pdImage.getStencilImage(getNonStrokingPaint());

            // draw the image
            drawBufferedImage(image, at);
        }
        else
        {
            // draw the image
            drawBufferedImage(pdImage.getImage(), at);
        }

        if (!pdImage.getInterpolate())
        {
            // JDK 1.7 has a bug where rendering hints are reset by the above call to
            // the setRenderingHint method, so we re-set all hints, see PDFBOX-2302
            setRenderingHints();
        }
    }
{code}
With your 2nd PDF (I didn't keep the 1st one), isScaledUp is false because the 
image is drawn at its original size in the PDF, so the code doesn't switch to 
low quality interpolation. Our code does not take your rendering resolution 
(250 dpi) into account.

So I made a change that considers the resolution too, so that it switches to 
low quality interpolation when the image is expanded. If you build from source 
you can test it:
{code}
            Matrix m = new Matrix(xform);
            m.concatenate(ctm);
            boolean isScaledUp = pdImage.getWidth() < Math.round(Math.abs(m.getScalingFactorX()))
                    || pdImage.getHeight() < Math.round(Math.abs(m.getScalingFactorY()));
{code}
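
The difference between the two checks can be sketched with plain 
java.awt.geom, without PDFBox. This is a simplified illustration and not the 
actual patch: the class and method names are hypothetical, the page is 
assumed to be rendered with a uniform device scale of dpi / 72, and the scale 
is read directly from the matrix diagonal, so rotation and shear are ignored.

```java
import java.awt.geom.AffineTransform;

// Hypothetical demo class, not part of PDFBox.
class ScaleCheckSketch {

    // Like the original check: compares the image size in pixels
    // with the scale of the CTM only (user space, 1 unit = 1/72 inch).
    static boolean isScaledUpCtmOnly(int imageWidth, int imageHeight, AffineTransform ctm) {
        return imageWidth < Math.round(ctm.getScaleX())
                || imageHeight < Math.round(ctm.getScaleY());
    }

    // Like the changed check: the device transform (assumed here to be a
    // uniform dpi / 72 scale) is combined with the CTM before comparing.
    static boolean isScaledUpWithDevice(int imageWidth, int imageHeight,
                                        AffineTransform ctm, double dpi) {
        AffineTransform combined = AffineTransform.getScaleInstance(dpi / 72.0, dpi / 72.0);
        combined.concatenate(ctm); // combined = device transform x CTM
        return imageWidth < Math.round(Math.abs(combined.getScaleX()))
                || imageHeight < Math.round(Math.abs(combined.getScaleY()));
    }

    public static void main(String[] args) {
        // Assumed example: a 600 x 400 pixel image drawn 600 x 400 units large,
        // i.e. one image pixel per user space unit.
        AffineTransform ctm = AffineTransform.getScaleInstance(600, 400);
        System.out.println("CTM only:       " + isScaledUpCtmOnly(600, 400, ctm));
        System.out.println("with 72 dpi:    " + isScaledUpWithDevice(600, 400, ctm, 72));
        System.out.println("with 250 dpi:   " + isScaledUpWithDevice(600, 400, ctm, 250));
    }
}
```

With these assumed numbers the CTM-only check never sees an enlargement, 
while the combined check does as soon as the rendering resolution is above 
72 dpi, which is when nearest neighbour would be selected.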
However, the quality of the images went down in my tests (which are done at 
96 dpi). This can be seen with

- 166292-fi-ligature.pdf
- PDFBOX-1679.pdf
- PDFBOX-2307-162362.pdf pages 3, 5, 7
- PDFBOX-2552-altimage.pdf (left bird only)
- PDFBOX-2302-mori-cvpr01.pdf page 7
- sigice9_172.Adobe.pdf

I'm attaching two of them, rendered at 96 dpi.

> PDFBox renders images 5 times slower and bigger
> -----------------------------------------------
>
>                 Key: PDFBOX-3523
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3523
>             Project: PDFBox
>          Issue Type: Improvement
>    Affects Versions: 2.0.3
>         Environment: Java version 1.8
> Ubuntu Linux&Windows 10
>            Reporter: Vasiliy Sadokhin
>              Labels: performance
>         Attachments: OldPdfUtils.java, PdfUtils.java, old_test.png, test.pdf, 
> test.png
>
>
> We recently migrated PDFBox from 1.8.13 to 2.0.3. We found that getting a 
> PDF page image is now 5 times slower and the image is 5 times bigger than 
> with PDFBox 1.8. For example, it took about 200 ms and now it takes more 
> than 1 second; the result size was less than 200 KB and it's 1 MB now. 
> We specified BufferedImage.TYPE_3BYTE_BGR for PDFBox 1.8 and we have no way 
> to do it with 2.x. That might be the reason.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
