[ https://issues.apache.org/jira/browse/PDFBOX-457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13912589#comment-13912589 ]
Tilman Hausherr commented on PDFBOX-457:
----------------------------------------
Current status with the 2.0 version: the fax file renders OK, and the scan file
(JBIG2-based) renders OK too if the levigo plugin is used. It's probably OK with
1.8.4 too, but I didn't test that. The original file still has a Group 4
decoding issue:
26.02.2014 07:56:55.681 WARN [main] org.apache.pdfbox.util.PDFStreamEngine:552 - java.lang.RuntimeException: Invalid code encountered while decoding 2D group 4 compressed data.
java.lang.RuntimeException: Invalid code encountered while decoding 2D group 4 compressed data.
    at org.apache.pdfbox.filter.ccitt.TIFFFaxDecoder.decodeT6(TIFFFaxDecoder.java:1010)
    at org.apache.pdfbox.filter.CCITTFaxFilter.decode(CCITTFaxFilter.java:115)
    at org.apache.pdfbox.filter.Filter.decode(Filter.java:58)
    at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:332)
    at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:281)
    at org.apache.pdfbox.cos.COSStream.getDecodeResult(COSStream.java:231)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.<init>(PDImageXObject.java:80)
    at org.apache.pdfbox.pdmodel.graphics.PDXObject.createXObject(PDXObject.java:65)
    at org.apache.pdfbox.pdmodel.PDResources.getXObjects(PDResources.java:248)
    at org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject.getResources(PDFormXObject.java:127)
    at org.apache.pdfbox.util.operator.pagedrawer.Invoke.process(Invoke.java:105)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:539)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:267)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:234)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:216)
    at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:147)
    at org.apache.pdfbox.util.RenderUtil.renderPage(RenderUtil.java:206)
    at org.apache.pdfbox.util.RenderUtil.convertToImage(RenderUtil.java:170)
    at pdfboxpageimageextraction.ExtractImages.doPdf(ExtractImages.java:278)
    at pdfboxpageimageextraction.ExtractImages.main(ExtractImages.java:83)
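
Since the JBIG2 scan file only renders when a JBIG2 plugin is present, it can help to confirm that such a reader is actually registered before rendering. The following is a minimal sketch using only the JDK's javax.imageio API; the class name ImageReaderCheck and the method hasReader are hypothetical, and it assumes the plugin registers itself under the ImageIO format name "JBIG2", which should be verified against the plugin's own documentation.

```java
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

// Hypothetical helper, not part of PDFBox.
public class ImageReaderCheck {

    // Returns true if ImageIO has at least one reader registered
    // for the given format name.
    public static boolean hasReader(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // Pick up any ImageIO plugins that are on the classpath.
        ImageIO.scanForPlugins();
        // "JBIG2" is an assumed format name for the JBIG2 plugin.
        System.out.println("JBIG2 reader available: " + hasReader("JBIG2"));
    }
}
```

If this reports that no reader is available, the plugin jar is probably missing from the classpath. This check says nothing about the Group 4 failure above, which occurs inside PDFBox's own CCITT decoder.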
> PDF to Image doesn't show correctly the document
> ------------------------------------------------
>
> Key: PDFBOX-457
> URL: https://issues.apache.org/jira/browse/PDFBOX-457
> Project: PDFBox
> Issue Type: Bug
> Components: Rendering
> Affects Versions: 0.8.0-incubator
> Reporter: Marcelo Tavares
> Assignee: Daniel Wilson
> Labels: CCITTFaxDecode, TIFF, ccitt
> Attachments: 580505.PR00003.000003.PDF,
> pdfbox-457-Scan_from_a_Xerox_WorkCentre_Pro.PDF, pdfbox-457-as_fax.pdf,
> pdfbox-457.PNG, testPDFToImage1.png
>
>
> I tried to convert the attached document to an image, but I got the attached
> result.
> Only the text was rendered. I also tried different formats such as JPG. I ran
> it using the PDFToImage class, passing the document path as a parameter.
> I've read that documents are sometimes not created according to the PDF
> standard. Is it possible to ignore that? This is very important to me, so
> could I use PDFBox despite those "errors"?
> Thank you
> Marcelo