Petr Slaby created PDFBOX-3338:
----------------------------------

             Summary: CCITT Fax decoder fails
                 Key: PDFBOX-3338
                 URL: https://issues.apache.org/jira/browse/PDFBOX-3338
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 2.0.1, 1.8.12
            Reporter: Petr Slaby


I have a PDF that does not render in PDFBox. It contains scanned pages 
encoded as CCITT Fax TIFFs. On every page, the decoder fails with 
IOException("TIFFFaxDecoder: EOL encountered in black run.") (or the same 
message with "white" instead of "black"). Unfortunately, the PDF contains 
sensitive data, so I cannot share it.

As a test, I replaced TIFFFaxDecoder with the class CCITTFaxDecoderStream 
from the Twelve Monkeys ImageIO library. After that, everything worked and 
PDFToImage produced the expected result. 

I have extracted the first few bytes of the TIFF to show the problem without 
sharing the confidential content. See the attached test program and test file.
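
For illustration only, cutting a leading fragment out of a stream so that it can be shared without the rest of the content could look like the sketch below. This is not the attached test program; the class name TruncateSample, the method firstBytes, and the command-line arguments are hypothetical, and the number of bytes to keep is left to the caller.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: keeps only the first 'limit' bytes of a stream.
public class TruncateSample {

    // Reads at most 'limit' bytes from 'in' and returns them.
    // Returns fewer bytes if the stream ends earlier.
    static byte[] firstBytes(InputStream in, int limit) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int remaining = limit;
        while (remaining > 0) {
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n < 0) {
                break; // end of stream reached before the limit
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    // Usage (assumed arguments): <source file> <target file> <byte count>
    public static void main(String[] args) throws IOException {
        Path source = Paths.get(args[0]);
        Path target = Paths.get(args[1]);
        int limit = Integer.parseInt(args[2]);
        try (InputStream in = Files.newInputStream(source)) {
            Files.write(target, firstBytes(in, limit));
        }
    }
}
```

A fragment produced this way is only useful as a reproducer if the decoder already fails within the retained bytes, so the byte count has to be chosen by trial.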

I tested this against the latest trunk version of PDFBox, but I think the 
decoder implementation is basically the same in all versions. 




