convertToImage mangles images which were in the PDF
---------------------------------------------------

                 Key: PDFBOX-958
                 URL: https://issues.apache.org/jira/browse/PDFBOX-958
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 1.4.0, 1.2.1
         Environment: WinXP, java version "1.6.0_23"
            Reporter: Eric Schwarzenbach
            Priority: Critical


Of the PDFs we've tried running through PDFBox and generating page images, a 
number of them (coming from disparate sources and method of creation) seem to 
produce images where an image that was embedded in the page of the PDF shows 
somewhat mangled. It seems to be divided by horizontal stripes, where some 
stripes look normal, others seem to have some kind of "smearing" effect going 
on. See attached images and original PDF (image is of page 13).

I marked this as critical as we are trying to use PDFBox in a project where 
page images are crucial, and inability to produce reasonable looking page 
images is pretty much a deal breaker. 

The code we use to extract the images looks more or less like the following:

                                        BufferedImage image = 
page.convertToImage();
                                        
                                        SmartDeferredFileOutputStream outStream 
= new SmartDeferredFileOutputStream();
                                        String[] writerFormatNames = 
ImageIO.getWriterFormatNames();
                                        ImageIO.write(image, "jpeg", outStream);
                                        outStream.close()

We've also tried specifying "png". In both "jpg" and "png" cases we get an 
image file that is indeed the correct format, and both images look exactly the 
same. 

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to