PDFImageWriter extracts black images from Arabic PDFs
-----------------------------------------------------

                 Key: PDFBOX-1072
                 URL: https://issues.apache.org/jira/browse/PDFBOX-1072
             Project: PDFBox
          Issue Type: Bug
          Components: Utilities
    Affects Versions: 1.6.0
            Reporter: Anton 


When I try to extract a JPEG image from an Arabic PDF, I get a corrupted file 
with a black area that overlays all of the Arabic text on each page.
The console shows only this debug message, with no exceptions or other errors:
DEBUG (PDPixelMap.java:241) - ColorModel: IndexColorModel: #pixelBits = 1 
numComponents = 4 color space = java.awt.color.ICC_ColorSpace@2eeb3c84 
transparency = 2 transIndex   = 1 has alpha = true isAlphaPre = false
This is not limited to a single PDF file. I have about 400-500 files that produce 
the same result.

Code:
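import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFImageWriter;
PDFImageWriter writer = new PDFImageWriter();
PDDocument document = PDDocument.load(sourceFile);
// Arguments: format "jpg", empty password, pages 1 to 1, output file prefix
writer.writeImage(document, "jpg", "", 1, 1, filename);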
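For triage, here is a minimal sketch in plain Java (JDK classes only, no PDFBox calls) of one possible explanation. It is an assumption and has not been confirmed against PDFBox 1.6.0: the debug line reports a 1-bit IndexColorModel with a transparent index, and JPEG has no alpha channel, so transparent pixels may end up black when written as "jpg". The class FlattenAlpha and the helpers sampleMask and flattenOnWhite are hypothetical names used only for illustration. The sketch builds a similar 1-bit image and paints it over an opaque white background before any JPEG encoding would happen.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.awt.image.IndexColorModel;

public class FlattenAlpha {

    // Hypothetical helper: builds a 1-bit indexed image whose palette index 1
    // is transparent, similar to the ColorModel reported in the debug line.
    static BufferedImage sampleMask(int width, int height) {
        byte[] r = {0, 0};
        byte[] g = {0, 0};
        byte[] b = {0, 0};
        // Last argument marks palette index 1 as fully transparent.
        IndexColorModel icm = new IndexColorModel(1, 2, r, g, b, 1);
        BufferedImage img =
                new BufferedImage(width, height, BufferedImage.TYPE_BYTE_BINARY, icm);
        // Leave column 0 at index 0 (opaque black); set the rest to index 1.
        for (int y = 0; y < height; y++) {
            for (int x = 1; x < width; x++) {
                img.getRaster().setSample(x, y, 0, 1);
            }
        }
        return img;
    }

    // Hypothetical helper: paints the source over an opaque white background,
    // producing an RGB image with no alpha channel.
    static BufferedImage flattenOnWhite(BufferedImage src) {
        BufferedImage rgb = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g2 = rgb.createGraphics();
        try {
            g2.setColor(Color.WHITE);
            g2.fillRect(0, 0, rgb.getWidth(), rgb.getHeight());
            g2.drawImage(src, 0, 0, null);
        } finally {
            g2.dispose();
        }
        return rgb;
    }

    public static void main(String[] args) {
        BufferedImage flat = flattenOnWhite(sampleMask(2, 1));
        // Print the RGB value of the opaque pixel and of the transparent one.
        System.out.println(Integer.toHexString(flat.getRGB(0, 0) & 0xFFFFFF));
        System.out.println(Integer.toHexString(flat.getRGB(1, 0) & 0xFFFFFF));
    }
}
```

If the flattened image looks correct when saved as JPEG, the transparency handling is a likely suspect. Writing the pages as "png" instead of "jpg" may be another way to test the same assumption.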



        
