Petr Slaby created PDFBOX-3338:
----------------------------------
Summary: CCITT Fax decoder fails
Key: PDFBOX-3338
URL: https://issues.apache.org/jira/browse/PDFBOX-3338
Project: PDFBox
Issue Type: Bug
Affects Versions: 2.0.1, 1.8.12
Reporter: Petr Slaby
I have a PDF that does not render in PDFBox. It contains scanned pages
encoded as CCITT Fax TIFFs. On every page, the decoder fails with
IOException("TIFFFaxDecoder: EOL encountered in black run.") (or the same
message with "white" instead of "black"). Unfortunately, the PDF contains
sensitive data and I cannot share it.
As a test, I replaced TIFFFaxDecoder with the class CCITTFaxDecoderStream
from the TwelveMonkeys ImageIO library. After that, everything worked and
PDFToImage produced the expected result.
I have extracted the first few bytes of the TIFF to show the problem without
sharing the confidential content. See the attached test program and test file.
I tested this against the latest trunk version of PDFBox, but I think the
decoder implementation is basically the same in all versions.
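For anyone who wants to build a similar reduced sample from their own data,
extracting the first few bytes of an image stream could look roughly like the
sketch below. It is not the attached test program. The class name StreamPrefix
and the method firstBytes are hypothetical and use only the Java standard
library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox or the attached test program.
final class StreamPrefix {

    private StreamPrefix() {
    }

    /**
     * Reads at most maxBytes from the stream and returns them.
     * The result is shorter than maxBytes if the stream ends early.
     */
    static byte[] firstBytes(InputStream in, int maxBytes) throws IOException {
        byte[] buffer = new byte[maxBytes];
        int total = 0;
        while (total < maxBytes) {
            // read() may return fewer bytes than requested, so keep looping
            int n = in.read(buffer, total, maxBytes - total);
            if (n < 0) {
                break; // end of stream reached
            }
            total += n;
        }
        return Arrays.copyOf(buffer, total);
    }
}
```

The returned array can be written to a small test file or embedded in a unit
test. A truncated stream will not decode as a full page, so it is only useful
if the failure already occurs within the extracted bytes.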