On 02.10.2017 at 00:50, Joel Hirsh wrote:
The only difference I can see between PageDrawer in the two versions
has to do with interpolation. I am thinking that
somehow interpolation defaults to off in 2.0.6 and on in 2.0.7. Does
that make sense?
No, it was never off. It may have been off in some respects, as in
PDFBOX-3615 <https://issues.apache.org/jira/browse/PDFBOX-3615>.
Tilman
And I'm not sure that interpolating makes an image better if the
original data doesn't have the resolution to begin with. That could
account for the fuzziness.
Thanks for the insight.
On Sun, Oct 1, 2017 at 3:22 PM, Tilman Hausherr wrote:
Ah, now I get it. Initially I thought you meant size as in height
/ width.
There have sometimes been some improvements re image quality; more
quality = bigger images. The one on the right is the better one.
Nobody would like an image with "blocky" glyphs.
I remember one issue, but that one was fixed in 2.0.4:
https://issues.apache.org/jira/browse/PDFBOX-3615
Another one is
https://issues.apache.org/jira/browse/PDFBOX-1958
but that one was fixed in 2.0.5 (and sometimes made images worse).
To find out what changed, I'd need to have a specific file to test
with. I could then get different versions from the repository and
build and test when this happened. (But it would be better if
you'd do it)
But currently we don't offer to switch this on or off. You could
change it in the source code, in PageDrawer.java, search for
setRenderingHints().
Tilman
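The effect of an interpolation rendering hint can be tried in isolation with plain Java2D. The following is a minimal sketch, not PageDrawer's code: the hints that PageDrawer actually sets are not shown in this thread, and the class and method names here (InterpolationDemo, checkerboard, scale, distinctColors) are hypothetical. It scales a tiny checkerboard up once with nearest-neighbor sampling and once with bicubic interpolation, then counts the distinct colors in each result.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; it only uses standard Java2D calls.
class InterpolationDemo {

    /** Builds a small black-and-white checkerboard standing in for low-resolution image data. */
    static BufferedImage checkerboard(int size) {
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                img.setRGB(x, y, ((x + y) % 2 == 0) ? 0xFFFFFF : 0x000000);
            }
        }
        return img;
    }

    /** Scales src by an integer factor using the given KEY_INTERPOLATION hint value. */
    static BufferedImage scale(BufferedImage src, int factor, Object interpolationHint) {
        int w = src.getWidth() * factor;
        int h = src.getHeight() * factor;
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, interpolationHint);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return dst;
    }

    /** Counts the distinct ARGB values in the image. */
    static int distinctColors(BufferedImage img) {
        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                colors.add(img.getRGB(x, y));
            }
        }
        return colors.size();
    }

    public static void main(String[] args) {
        BufferedImage src = checkerboard(8);
        BufferedImage blocky = scale(src, 4, RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR);
        BufferedImage smooth = scale(src, 4, RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        System.out.println("nearest neighbor colors: " + distinctColors(blocky));
        System.out.println("bicubic colors:          " + distinctColors(smooth));
    }
}
```

Nearest-neighbor sampling can only copy existing pixel values, so the result stays pure black and white ("blocky"). Bicubic interpolation introduces intermediate grays along every edge, which looks smoother but also softer.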
On 02.10.2017 at 00:00, Joel Hirsh wrote:
My code looks like this:
    BufferedImage pageimage =
        new PDFRenderer(pdfdocument).renderImageWithDPI(pagenum, outputres);
    ImageIO.write(pageimage, "png", tmpfile);
where outputres is 300.0f.
The only difference on my side is that in one case I build with
pdfbox-app-2.0.6.jar, in the other with pdfbox-app-2.0.7.jar.
In both versions I get a full-page image that is 2550 x 3300
pixels, with a bit depth of 24 bits. However, for the one-page
file I am looking at, the 2.0.6 .png file is 1.53MB and the 2.0.7
file is 7.83MB.
If I read both .png files into Photoshop and zoom way in, they
are indeed different. This is a screenshot of them side by side
with 2.0.6 on the left and 2.0.7 on the right.
[Inline image 1: side-by-side screenshot, 2.0.6 left, 2.0.7 right]
Since gmail seemed to shrink the image when I pasted it in, I
also included it as an attachment.
If you look at the attachment closely you can see the Photoshop
pixel numbers on the left. It appears that 2.0.6 actually has a
minimum resolution of 2 pixels, whereas 2.0.7 has a resolution of
1 pixel. But 2.0.7 seems slightly out of focus. So as I said
before, I'm not sure which is better, but it's not a difference I
would expect, and I would like to have some control over it.
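The "minimum resolution of 2 pixels" observation can be checked programmatically rather than by zooming in Photoshop. The sketch below is a hypothetical diagnostic built on synthetic data, not on the PDF from this thread. The names BlockCheck, isBlockReplicated, pngSize, noise, pixelDouble and bilinearDouble are made up for illustration. It tests whether an image consists of uniform 2x2 cells and measures the encoded PNG size in memory.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Random;
import javax.imageio.ImageIO;

// Hypothetical diagnostic class; the input images are synthetic.
class BlockCheck {

    /** Returns true if every block x block cell of the image holds a single color. */
    static boolean isBlockReplicated(BufferedImage img, int block) {
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int cornerX = x - x % block;
                int cornerY = y - y % block;
                if (img.getRGB(x, y) != img.getRGB(cornerX, cornerY)) {
                    return false;
                }
            }
        }
        return true;
    }

    /** Encodes the image as PNG in memory and returns the size in bytes. */
    static int pngSize(BufferedImage img) {
        ImageIO.setUseCache(false); // keep the encoder off the disk cache (global setting)
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(img, "png", out)) {
                throw new IOException("no PNG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    /** Seeded gray noise standing in for low-resolution source image data. */
    static BufferedImage noise(int size, long seed) {
        Random rnd = new Random(seed);
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                int v = rnd.nextInt(256);
                img.setRGB(x, y, (v << 16) | (v << 8) | v);
            }
        }
        return img;
    }

    /** Doubles the image by copying each source pixel into a 2x2 cell. */
    static BufferedImage pixelDouble(BufferedImage src) {
        BufferedImage dst = new BufferedImage(src.getWidth() * 2, src.getHeight() * 2,
                BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < dst.getHeight(); y++) {
            for (int x = 0; x < dst.getWidth(); x++) {
                dst.setRGB(x, y, src.getRGB(x / 2, y / 2));
            }
        }
        return dst;
    }

    /** Doubles the image with Java2D bilinear interpolation. */
    static BufferedImage bilinearDouble(BufferedImage src) {
        int w = src.getWidth() * 2;
        int h = src.getHeight() * 2;
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return dst;
    }

    public static void main(String[] args) {
        BufferedImage src = noise(64, 42L);
        BufferedImage doubled = pixelDouble(src);
        BufferedImage smooth = bilinearDouble(src);
        System.out.println("pixel-doubled: 2x2 cells=" + isBlockReplicated(doubled, 2)
                + ", PNG bytes=" + pngSize(doubled));
        System.out.println("bilinear:      2x2 cells=" + isBlockReplicated(smooth, 2)
                + ", PNG bytes=" + pngSize(smooth));
    }
}
```

To apply the check to real output, the two rendered page images could be loaded with ImageIO.read and passed to isBlockReplicated. A pixel-doubled image carries redundant rows and columns that PNG compression can exploit, which would be consistent with the smaller 2.0.6 file, though this sketch does not prove that this is what happened here.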
Regards
On Sun, Oct 1, 2017 at 10:19 AM, Tilman Hausherr wrote:
Hi,
No idea what you mean. I have a test with about 1000 PDF
files that renders at 96dpi and compares the result. If the
images were bigger / smaller then this would be noticed.
Please explain what size you get from what PDF with what
version and what code.
Tilman
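A render-and-compare regression check of the kind described above can be approximated with a pixel-by-pixel diff. This is only a sketch of the idea, not the PDFBox test harness, whose implementation is not shown in this thread. RenderDiff and countDifferingPixels are hypothetical names.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper; it compares two already rendered images.
class RenderDiff {

    /** Counts pixels whose ARGB values differ; rejects images of different dimensions. */
    static long countDifferingPixels(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            throw new IllegalArgumentException("image dimensions differ");
        }
        long differing = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    differing++;
                }
            }
        }
        return differing;
    }

    public static void main(String[] args) {
        BufferedImage reference = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage candidate = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        candidate.setRGB(1, 2, 0xFFFFFF); // simulate a single changed pixel
        System.out.println("differing pixels: " + countDifferingPixels(reference, candidate));
    }
}
```

Run against the same page rendered by 2.0.6 and by 2.0.7, a count like this would show whether the pixel data itself changed, independent of the PNG file size.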
On 01.10.2017 at 18:55, Joel Hirsh wrote:
I am using PDFRenderer.renderImageWithDPI and found that the
generated images have changed substantially from 2.0.6 to 2.0.7.
Images are 2x to 4x bigger in 2.0.7 than in 2.0.6, which really
impacts OCR processing on those images.
It's hard to say whether or how one is 'better' than the other,
but I'd like to understand what is happening and maybe how to
control this. I didn't see anything in the 2.0.7 release notes
that seemed to indicate a difference.
I have verified that I can go back and forth between 2.0.6 and
2.0.7 and get consistent results within each version. And
everything else is the same.
I am also using the following jars:
levigo-jbig2-imageio-1.6.5.jar
imageio-jpeg-3.2.1.jar
imageio-metadata-3.2.1.jar
imageio-core-3.2.1.jar
common-image-3.2.1.jar
jai-imageio-core-1.3.1.jar
jai-imageio-jpeg2000-1.3.1_CODICE_1.jar
Any light you can cast on this would be appreciated.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
<mailto:[email protected]>
For additional commands, e-mail: [email protected]
<mailto:[email protected]>