[ 
https://issues.apache.org/jira/browse/PDFBOX-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jérôme Mainaud updated PDFBOX-606:
----------------------------------

    Attachment: ._pellochmar10.pdf

Here is an example. The file is not a valid PDF. It is a file containing the 
preview of the file, generated by Mac OS. You can reproduce the same loop by 
changing the extension of a JPEG file to .pdf.

The parser should reject such a file instead of looping forever.
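
Until the parser rejects such input itself, calling code could guard against it 
by checking for the "%PDF-" header before handing the stream to PDDocument.load. 
The following is only a sketch of that idea, not PDFBox code: the class 
PdfHeaderCheck and the method looksLikePdf are hypothetical names, and the check 
assumes the header sits at the very first byte, which is stricter than some 
lenient readers that tolerate leading junk.

```java
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox.
public class PdfHeaderCheck {
    private static final byte[] PDF_MAGIC =
            "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /**
     * Returns true if the stream starts with "%PDF-".
     * Every byte read is pushed back, so the stream can still be parsed
     * from the beginning. The pushback buffer of "in" must hold at least
     * PDF_MAGIC.length bytes.
     */
    public static boolean looksLikePdf(PushbackInputStream in) throws IOException {
        byte[] head = new byte[PDF_MAGIC.length];
        int n = 0;
        // read() may return fewer bytes than requested, so loop until
        // the header is complete or the stream ends.
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        if (n > 0) {
            in.unread(head, 0, n);
        }
        if (n < head.length) {
            return false; // stream is shorter than the header
        }
        for (int i = 0; i < head.length; i++) {
            if (head[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would wrap its stream as new PushbackInputStream(inputStream, 8), skip 
the file when looksLikePdf returns false, and otherwise pass the wrapped stream 
to PDDocument.load. A JPEG renamed to .pdf fails this check because it starts 
with the bytes FF D8 rather than "%PDF-".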

> infinite loop encountered in PushBackInputStream.read
> -----------------------------------------------------
>
>                 Key: PDFBOX-606
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-606
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>            Reporter: Nicholas Blair
>         Attachments: ._pellochmar10.pdf
>
>
> While processing customer content for a Lucene index using PDFBox, we 
> encountered an infinite loop in PDDocument.load. Stack trace:
> java.io.FileInputStream.readBytes(Native Method)
> java.io.FileInputStream.read(FileInputStream.java:199)
> java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
> java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>    - locked java.io.bufferedinputstr...@f5ef5d
> java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>    - locked java.io.bufferedinputstr...@15b9c29
> java.io.FilterInputStream.read(FilterInputStream.java:66)
> java.io.PushbackInputStream.read(PushbackInputStream.java:122)
> org.apache.pdfbox.io.PushBackInputStream.read(PushBackInputStream.java:84)
> org.apache.pdfbox.pdfparser.BaseParser.skipSpaces(BaseParser.java:1190)
> org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:188)
> org.apache.pdfbox.pdfparser.PDFParser.parseTrailer(PDFParser.java:767)
> org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:456)
> org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
> org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:841)
> org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:808)
> edu.wisc.mywebspace.search.pdf.PdfDocumentContentParser.parse(PdfDocumentContentParser.java:47)
> Calling code looks like:
> document = PDDocument.load(inputStream);

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.