[jira] [Resolved] (PDFBOX-2778) PDF to Image conversion fails with "Invalid code word encountered"

Tilman Hausherr (JIRA) Thu, 30 Apr 2015 09:56:23 -0700

     [ https://issues.apache.org/jira/browse/PDFBOX-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Tilman Hausherr resolved PDFBOX-2778.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.0.0
                   1.8.10

A look at the file (created by "TWZTYWdlbVBkZg") showed that the two images had 
the same compressed length and were padded with 0xFF at the end. Further 
investigation showed that the number of rows wasn't being passed to our 
decoder, although the decoder can handle a row count.
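
To illustrate why a missing row count matters, here is a minimal sketch. It is
not PDFBox's decoder and does not implement real CCITT code words: the class
RowLimitSketch, the method decodeRows and the toy run-length format are all
hypothetical, chosen only to show that a decoder which knows the number of rows
stops before it reaches trailing 0xFF padding, while one that doesn't keeps
reading and fails on it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical toy decoder, NOT PDFBox code. Each code byte is a run length;
// a row is complete once its runs add up to `columns`. 0xFF is never valid.
final class RowLimitSketch {

    // Returns the number of rows decoded. rows <= 0 means "row count unknown":
    // keep decoding until the end of the stream.
    static int decodeRows(InputStream in, int columns, int rows) throws IOException {
        int decoded = 0;
        while (rows <= 0 || decoded < rows) {
            int filled = 0;
            while (filled < columns) {
                int code = in.read();
                if (code < 0) {
                    if (filled == 0) {
                        return decoded; // clean end of data between rows
                    }
                    throw new IOException("Unexpected end of data inside a row");
                }
                if (code == 0xFF) {
                    // Padding bytes are not valid codes in this toy format.
                    throw new IOException("Invalid code word encountered");
                }
                filled += code;
            }
            decoded++;
        }
        return decoded; // stopped because the requested row count was reached
    }

    public static void main(String[] args) throws IOException {
        // Two rows of 8 columns, followed by 0xFF padding.
        byte[] data = {4, 4, 8, (byte) 0xFF, (byte) 0xFF};

        // With the row count, decoding stops before the padding.
        System.out.println(decodeRows(new ByteArrayInputStream(data), 8, 2));

        // Without it, the decoder runs into the padding and fails.
        try {
            decodeRows(new ByteArrayInputStream(data), 8, 0);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Run as is, the sketch prints 2 and then "Invalid code word encountered". In a
PDF, the row count would presumably come from the image's decode parameters;
how PDFBox wires that through is not shown here.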

> PDF to Image conversion fails with "Invalid code word encountered"
> ------------------------------------------------------------------
>
>                 Key: PDFBOX-2778
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2778
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 1.8.9, 2.0.0
>            Reporter: Siegfried Goeschl
>            Assignee: Tilman Hausherr
>              Labels: CCITTFaxDecode, ccitt
>             Fix For: 1.8.10, 2.0.0
>
>         Attachments: invalid-code-word-01.pdf
>
>
> One real-life PDF throws the following exception:
> pdfbox-1.8.9> ./pdf-to-image.sh pdf/invalid-code-word-01.pdf
> java -jar pdfbox-app-1.8.9.jar PDFToImage pdf/invalid-code-word-01.pdf
> Apr 28, 2015 8:57:43 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
> SEVERE: java.io.IOException: Invalid code word encountered
> java.io.IOException: Invalid code word encountered
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream$NonLeafLookupTreeNode.getNextCodeWord(CCITTFaxG31DDecodeInputStream.java:360)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream$NonLeafLookupTreeNode.getNextCodeWord(CCITTFaxG31DDecodeInputStream.java:358)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream$NonLeafLookupTreeNode.getNextCodeWord(CCITTFaxG31DDecodeInputStream.java:358)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream$NonLeafLookupTreeNode.getNextCodeWord(CCITTFaxG31DDecodeInputStream.java:358)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream$NonLeafLookupTreeNode.getNextCodeWord(CCITTFaxG31DDecodeInputStream.java:358)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream$NonLeafLookupTreeNode.getNextCodeWord(CCITTFaxG31DDecodeInputStream.java:358)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream$NonLeafLookupTreeNode.getNextCodeWord(CCITTFaxG31DDecodeInputStream.java:358)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream$NonLeafLookupTreeNode.getNextCodeWord(CCITTFaxG31DDecodeInputStream.java:358)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream$NonLeafLookupTreeNode.getNextCodeWord(CCITTFaxG31DDecodeInputStream.java:358)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream$NonLeafLookupTreeNode.getNextCodeWord(CCITTFaxG31DDecodeInputStream.java:358)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream$NonLeafLookupTreeNode.getNextCodeWord(CCITTFaxG31DDecodeInputStream.java:358)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream.decodeLine(CCITTFaxG31DDecodeInputStream.java:143)
>       at org.apache.pdfbox.io.ccitt.CCITTFaxG31DDecodeInputStream.read(CCITTFaxG31DDecodeInputStream.java:104)
>       at java.io.InputStream.read(InputStream.java:170)
>       at java.io.FilterInputStream.read(FilterInputStream.java:133)
>       at org.apache.pdfbox.io.ccitt.FillOrderChangeInputStream.read(FillOrderChangeInputStream.java:45)
>       at java.io.FilterInputStream.read(FilterInputStream.java:107)
>       at org.apache.pdfbox.io.IOUtils.copy(IOUtils.java:68)
>       at org.apache.pdfbox.io.IOUtils.toByteArray(IOUtils.java:52)
>       at org.apache.pdfbox.filter.CCITTFaxDecodeFilter.decode(CCITTFaxDecodeFilter.java:110)
>       at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:379)
>       at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:291)
>       at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:225)
>       at org.apache.pdfbox.pdmodel.graphics.xobject.PDCcitt.getRGBImage(PDCcitt.java:201)
>       at org.apache.pdfbox.util.operator.pagedrawer.Invoke.process(Invoke.java:87)
>       at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:557)
>       at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
>       at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
>       at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
>       at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:139)
>       at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:801)
>       at org.apache.pdfbox.util.PDFImageWriter.writeImage(PDFImageWriter.java:130)
>       at org.apache.pdfbox.PDFToImage.main(PDFToImage.java:226)
>       at org.apache.pdfbox.PDFBox.main(PDFBox.java:96)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

