[ 
https://issues.apache.org/jira/browse/PDFBOX-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953699#comment-14953699
 ] 

Andreas Lehmkühler edited comment on PDFBOX-2976 at 10/13/15 6:59 AM:
----------------------------------------------------------------------

Updated patch: the exception is now swallowed only if some data could be 
read before it occurred.

WDYT? Should we change the behaviour of the FlateFilter (throwing vs. 
swallowing the exception in some cases) so that such streams are parsed at 
least partly?
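
To make the question concrete, here is a minimal sketch of the "swallow only if some data was read" idea, using java.util.zip.Inflater directly. It is not the attached patch and not the actual FlateFilter code; the class and method names (LenientInflateSketch, inflateLeniently, deflate) are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration only; not the PDFBox FlateFilter implementation.
class LenientInflateSketch {

    // Inflates 'in' to 'out' and returns the number of bytes written.
    // A DataFormatException is swallowed only if some data was decoded before.
    static long inflateLeniently(InputStream in, OutputStream out) throws IOException {
        Inflater inflater = new Inflater();
        byte[] inBuf = new byte[2048];
        byte[] outBuf = new byte[2048];
        long written = 0;
        try {
            while (!inflater.finished()) {
                if (inflater.needsInput()) {
                    int n = in.read(inBuf);
                    if (n <= 0) {
                        break; // input ended early; keep what we have
                    }
                    inflater.setInput(inBuf, 0, n);
                }
                int produced;
                try {
                    produced = inflater.inflate(outBuf);
                } catch (DataFormatException e) {
                    if (written > 0) {
                        break; // partial result: swallow the exception
                    }
                    throw new IOException(e); // nothing decoded: report it
                }
                if (produced == 0 && inflater.needsDictionary()) {
                    throw new IOException("preset dictionary required");
                }
                out.write(outBuf, 0, produced);
                written += produced;
            }
        } finally {
            inflater.end();
        }
        return written;
    }

    // Helper for the demo: zlib-compresses a byte array.
    static byte[] deflate(byte[] plain) {
        Deflater deflater = new Deflater();
        deflater.setInput(plain);
        deflater.finish();
        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            packed.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return packed.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = new byte[10000];
        for (int i = 0; i < plain.length; i++) {
            plain[i] = (byte) ('a' + i % 26);
        }
        byte[] damaged = deflate(plain);
        damaged[damaged.length - 1] ^= 0x55; // corrupt the trailing Adler-32 check
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long n = inflateLeniently(new ByteArrayInputStream(damaged), out);
        System.out.println("recovered " + n + " of " + plain.length + " bytes");
    }
}
```

One caveat with this approach: when inflate() throws, any bytes it decoded during that same call are not reported, so the tail of the stream is lost and a stream smaller than the output buffer still fails completely.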


was (Author: lehmi):
Updated patch, the exception is swallowed only if some data could be read before

> java.util.zip.DataFormatException: incorrect data check
> -------------------------------------------------------
>
>                 Key: PDFBOX-2976
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2976
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.0
>         Environment: Linux Mint 17.2 x64, JDK7u79, Glassfish 3.1.2.2
>            Reporter: Felix Rudolphi
>            Assignee: Andreas Lehmkühler
>         Attachments: PDFBOX2976_FlateFilter2.patch, sc-356376(1)-x.pdf, 
> sc-356376(1).pdf, sc-356376-x.pdf, sc-356376.pdf
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> When trying to open certain PDF files (examples attached, also any MSDS 
> available at http://www.scbt.com/datasheet-356376.html ), an exception is 
> thrown, resulting in the file not being parsed:
> java.io.IOException: java.util.zip.DataFormatException: incorrect data check
>       at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:83)
>       at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:78)
>       at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:160)
>       at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:143)
>       at org.apache.pdfbox.pdmodel.PDPage.getContents(PDPage.java:148)
>       at org.apache.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:92)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:450)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:437)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:148)
>       at org.apache.pdfbox.text.PDFTextStreamEngine.processPage(PDFTextStreamEngine.java:117)
>       at org.apache.pdfbox.text.PDFTextStripper.processPage(PDFTextStripper.java:367)
>       at org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:303)
>       at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:248)
>       at org.apache.pdfbox.text.PDFTextStripper.getText(PDFTextStripper.java:209)
> -- or --
> java.io.IOException: java.util.zip.DataFormatException: incorrect data check
>       at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:83)
>       at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:78)
>       at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:160)
>       at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:143)
>       at org.apache.pdfbox.pdmodel.PDPage.getContents(PDPage.java:148)
>       at org.apache.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:92)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:450)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:437)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:148)
>       at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:179)
>       at org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:205)
>       at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:136)


