On 02.10.2017 at 00:50, Joel Hirsh wrote:
The only difference I can see between PageDrawer in the two versions has to do with interpolation.  I am thinking that somehow interpolation defaults to off in 2.0.6 and on in 2.0.7.  Does that make sense?

No, it was never off. It may have been off in partial aspects, as in PDFBOX-3615 <https://issues.apache.org/jira/browse/PDFBOX-3615>.

Tilman


And I'm not sure that interpolating makes an image better if the original data doesn't have the resolution to begin with.  That could account for the fuzziness.
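
To make the interpolation question concrete, here is a minimal Java sketch that uses only the standard java.awt API, not PDFBox itself. It upscales a tiny checkerboard once with nearest-neighbor and once with bilinear interpolation, then counts the distinct colors in each result. The class name InterpolationDemo and its helper methods are made up for illustration, and the way PageDrawer actually configures its rendering hints may differ.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

public class InterpolationDemo {

    // Builds a size x size black/white checkerboard with one-pixel cells,
    // standing in for low-resolution source image data.
    public static BufferedImage checkerboard(int size) {
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                img.setRGB(x, y, ((x + y) % 2 == 0) ? 0xFFFFFF : 0x000000);
            }
        }
        return img;
    }

    // Upscales src by an integer factor using the given KEY_INTERPOLATION value.
    public static BufferedImage upscale(BufferedImage src, int factor, Object interpolation) {
        BufferedImage dst = new BufferedImage(
                src.getWidth() * factor, src.getHeight() * factor, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, interpolation);
        g.drawImage(src, 0, 0, dst.getWidth(), dst.getHeight(), null);
        g.dispose();
        return dst;
    }

    // Counts the distinct RGB values in an image (alpha is ignored).
    public static int distinctColors(BufferedImage img) {
        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                colors.add(img.getRGB(x, y) & 0xFFFFFF);
            }
        }
        return colors.size();
    }

    public static void main(String[] args) {
        BufferedImage src = checkerboard(8);
        BufferedImage nearest = upscale(src, 4,
                RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR);
        BufferedImage bilinear = upscale(src, 4,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        System.out.println("nearest-neighbor colors: " + distinctColors(nearest));
        System.out.println("bilinear colors:         " + distinctColors(bilinear));
    }
}
```

Nearest-neighbor only copies source pixels, so the edges stay hard and blocky. Bilinear blends neighboring pixels and introduces intermediate grays along every edge, which is one plausible source of a softer look when the source data is low resolution.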

Thanks for the insight.

On Sun, Oct 1, 2017 at 3:22 PM, Tilman Hausherr <[email protected]> wrote:

    Ah, now I get it. Initially I thought you meant size as in height
    / width.

    There have sometimes been some improvements re image quality; more
    quality = bigger images. The one on the right is the better one.
    Nobody would like an image with "blocky" glyphs.

    I remember one issue but that one was fixed in 2.0.4
    https://issues.apache.org/jira/browse/PDFBOX-3615

    Another one is
    https://issues.apache.org/jira/browse/PDFBOX-1958
    but that one was fixed in 2.0.5 (and sometimes made images worse)

    To find out what changed, I'd need to have a specific file to test
    with. I could then get different versions from the repository and
    build and test when this happened. (But it would be better if
    you'd do it)

    But currently we don't offer an option to switch this on or off.
    You could change it in the source code: in PageDrawer.java,
    search for setRenderingHints().

    Tilman



    On 02.10.2017 at 00:00, Joel Hirsh wrote:
    My code looks like this

    BufferedImage pageimage =
        new PDFRenderer(pdfdocument).renderImageWithDPI(pagenum, outputres);

    ImageIO.write(pageimage, "png", tmpfile);

    where outputres is 300.0f.

    The only difference on my side is that in one case I build with
    pdfbox-app-2.0.6.jar, the other with pdfbox-app-2.0.7.jar

    In both versions I get a full-page image that is 2550 x 3300
    pixels, with a bit depth of 24 bits. However, for the one-page
    file I am looking at, the 2.0.6 .png file is 1.53 MB and the
    2.0.7 file is 7.83 MB.

    If I read both .png files into Photoshop and zoom way in, they
    are indeed different.  This is a screenshot of them side by side
    with 2.0.6 on the left and 2.0.7 on the right.

    Inline image 1

    Since Gmail seemed to shrink the image when I pasted it in, I
    also included it as an attachment.

    If you look at the attachment closely you can see the Photoshop
    pixel numbers on the left. It appears that 2.0.6 actually has a
    minimum resolution of 2 pixels, whereas 2.0.7 has a resolution of
    1 pixel. But 2.0.7 seems slightly out of focus. So as I said
    before, I'm not sure which is better, but it's not a difference I
    would expect, and I would like to have some control over it.
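
The observation above (2-pixel blocks in the smaller file, 1-pixel detail in the larger one) can be checked programmatically. The following is a rough Java sketch using only the standard library. It tests whether an image is made up entirely of uniform 2x2 cells and measures how many bytes the image takes as a PNG. The class PngSizeDemo, its method names, and the synthetic random test images are assumptions for illustration and are not taken from PDFBox or from the files discussed here.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Random;
import javax.imageio.ImageIO;

public class PngSizeDemo {

    // Creates a gray image in which every block x block cell shares one
    // pseudo-random value; block = 1 gives independent per-pixel values.
    public static BufferedImage randomBlocks(int size, int block, long seed) {
        Random rnd = new Random(seed);
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int by = 0; by < size; by += block) {
            for (int bx = 0; bx < size; bx += block) {
                int v = rnd.nextInt(256);
                int rgb = (v << 16) | (v << 8) | v;
                for (int y = by; y < Math.min(by + block, size); y++) {
                    for (int x = bx; x < Math.min(bx + block, size); x++) {
                        img.setRGB(x, y, rgb);
                    }
                }
            }
        }
        return img;
    }

    // Returns true if every aligned 2x2 cell contains a single color,
    // i.e. the image has an effective resolution of 2 pixels.
    public static boolean isTwoPixelBlocky(BufferedImage img) {
        for (int y = 0; y + 1 < img.getHeight(); y += 2) {
            for (int x = 0; x + 1 < img.getWidth(); x += 2) {
                int c = img.getRGB(x, y);
                if (img.getRGB(x + 1, y) != c
                        || img.getRGB(x, y + 1) != c
                        || img.getRGB(x + 1, y + 1) != c) {
                    return false;
                }
            }
        }
        return true;
    }

    // Encodes the image as PNG in memory and returns the encoded size in bytes.
    public static int pngBytes(BufferedImage img) {
        ImageIO.setUseCache(false); // keep everything in memory, no temp files
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            ImageIO.write(img, "png", out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    public static void main(String[] args) {
        BufferedImage blocky = randomBlocks(64, 2, 1L);
        BufferedImage fine = randomBlocks(64, 1, 1L);
        System.out.println("blocky: 2x2 cells=" + isTwoPixelBlocky(blocky)
                + ", png bytes=" + pngBytes(blocky));
        System.out.println("fine:   2x2 cells=" + isTwoPixelBlocky(fine)
                + ", png bytes=" + pngBytes(fine));
    }
}
```

An image with per-pixel detail carries roughly four times as many independent values as one built from 2x2 cells, so its PNG is larger. That is consistent with the size jump reported here, although it does not by itself show what changed between the two versions.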


    Regards


    On Sun, Oct 1, 2017 at 10:19 AM, Tilman Hausherr
    <[email protected]> wrote:

        Hi,

        No idea what you mean. I have a test with about 1000 PDF
        files that renders at 96dpi and compares the result. If the
        images were bigger / smaller then this would be noticed.
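
A regression check of the kind mentioned above could be sketched roughly as follows. This is not the actual PDFBox test, only a hypothetical helper (the class RenderDiff and the method differingPixels are invented here) that compares two renderings pixel by pixel and reports a dimension mismatch separately from content differences.

```java
import java.awt.image.BufferedImage;

public class RenderDiff {

    // Returns the number of pixels whose ARGB values differ between a and b,
    // or -1 if the two images do not have the same width and height.
    public static long differingPixels(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            return -1;
        }
        long count = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        BufferedImage a = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage b = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        b.setRGB(1, 2, 0xFFFFFF); // change a single pixel in b
        System.out.println("differing pixels: " + differingPixels(a, b));
    }
}
```

A check like this catches a change in pixel dimensions immediately, but a change in smoothing at identical dimensions only shows up as a nonzero count of differing pixels.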

        Please explain what size you get from what PDF with what
        version and what code.

        Tilman


        On 01.10.2017 at 18:55, Joel Hirsh wrote:

            I am using PDFRenderer.renderImageWithDPI and found that
            the generated images have changed substantially from
            2.0.6 to 2.0.7. Images are 2x to 4x bigger in 2.0.7 than
            in 2.0.6, which really impacts OCR processing on those
            images.

            It's hard to say whether or how one is 'better' than the
            other, but I'd like to understand what is happening and
            maybe how to control this. I didn't see anything in the
            2.0.7 release notes that seemed to indicate a difference.

            I have verified that I can go back and forth between
            2.0.6 and 2.0.7 and get consistent results within each
            version. And everything else is the same.

            I am also using the following jars:
            levigo-jbig2-imageio-1.6.5.jar
            imageio-jpeg-3.2.1.jar
            imageio-metadata-3.2.1.jar
            imageio-core-3.2.1.jar
            common-image-3.2.1.jar
            jai-imageio-core-1.3.1.jar
            jai-imageio-jpeg2000-1.3.1_CODICE_1.jar

            Any light you can cast on this would be appreciated.



        ---------------------------------------------------------------------
        To unsubscribe, e-mail: [email protected]
        <mailto:[email protected]>
        For additional commands, e-mail: [email protected]
        <mailto:[email protected]>







