[
https://issues.apache.org/jira/browse/PDFBOX-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tilman Hausherr reopened PDFBOX-955:
------------------------------------
Reproduced In: 2.0.0
PDF files with G4 images render blank again. This can be reproduced with the file
d0000040.pdf attached to PDFBOX-955. The reason seems to be that the pixels of the
embedded TIFF files are inverted and then drawn onto a white image, so we get
white on white, i.e. nothing. I "prove" my point with this change in
pdfbox\pdfviewer\PageDrawer.java (this is not a fix, but it will hopefully give
a hint):
public void drawImage(Image awtImage, AffineTransform at)
{
    graphics.setComposite(getGraphicsState().getStrokeJavaComposite());
    graphics.setClip(getGraphicsState().getCurrentClippingPath());
    // these two lines from me
    graphics.setColor(Color.BLACK);
    graphics.fillRect(0, 0, 5000, 5000);
    graphics.drawImage(awtImage, at, null);
}
Now the rendered file is no longer all white; it is white on black. I suspect
that the problem is somehow related to transparent backgrounds / pixels.
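
To make the suspicion concrete, here is a minimal, self-contained sketch that uses only java.awt and no PDFBox classes. It assumes (this is not confirmed) that the decoded G4 image ends up with opaque white "ink" pixels on a fully transparent background. The class and method names are hypothetical and exist only for this illustration.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical demo class; not part of PDFBox.
class WhiteOnWhiteDemo {

    // Assumption: the "ink" is opaque white (inverted) and everything else
    // is fully transparent. A new TYPE_INT_ARGB image starts fully transparent.
    static BufferedImage invertedBitonal() {
        BufferedImage img = new BufferedImage(4, 4, BufferedImage.TYPE_INT_ARGB);
        img.setRGB(1, 1, 0xFFFFFFFF); // a single opaque white "ink" pixel
        return img;
    }

    // Fills a page with the given background and draws the image on top,
    // mirroring the fillRect + drawImage experiment above.
    static BufferedImage drawOn(Color background, BufferedImage image) {
        BufferedImage page = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = page.createGraphics();
        g.setColor(background);
        g.fillRect(0, 0, page.getWidth(), page.getHeight());
        g.drawImage(image, 0, 0, null);
        g.dispose();
        return page;
    }

    // Returns true if every pixel has the same colour, i.e. the page looks blank.
    static boolean isUniform(BufferedImage img) {
        int first = img.getRGB(0, 0);
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                if (img.getRGB(x, y) != first) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BufferedImage img = invertedBitonal();
        System.out.println("blank on white: " + isUniform(drawOn(Color.WHITE, img)));
        System.out.println("blank on black: " + isUniform(drawOn(Color.BLACK, img)));
    }
}
```

Under that assumption the page drawn on white comes out blank, while the same image drawn on black shows the pixel, which matches what I see with the modified drawImage(). It does not show where the inversion or the transparency actually comes from.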
> Can't extract b/w images from PDF
> ---------------------------------
>
> Key: PDFBOX-955
> URL: https://issues.apache.org/jira/browse/PDFBOX-955
> Project: PDFBox
> Issue Type: Improvement
> Affects Versions: 1.4.0
> Environment: Windows XP prof, Java 1.6.0_22, Netbeans 6.9.1
> Reporter: Tilman Hausherr
> Assignee: Andreas Lehmkühler
> Priority: Minor
> Labels: extract
> Fix For: 1.6.0
>
> Attachments: d0000040-01.png, d0000040.pdf, ExtractImages.java,
> PDFBOX955-d00000401.png, PDFBOX955-photo1.png, photo.jpg, photo.pdf
>
>
> I wrote a test application using org.apache.pdfbox.ExtractImages to
> extract images as PNG. (This is the start of something bigger, which involves
> compiling statistics on the content of over a million pages within PDF
> files.) However, the images I get are all black or all white when I test on our
> own PDF files. I did get correct images from a file that had color images. To
> extract, I tried page.convertToImage() and then writing with ImageIO.write(),
> and I also tried using PDFImageWriter; neither worked for b/w images.
> The sample PDF is not confidential; it does give a warning "getRGBImage
> returned NULL", but other PDFs that don't give the warning (but are
> confidential) also fail.