[ 
https://issues.apache.org/jira/browse/PDFBOX-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170556#comment-14170556
 ] 

Tilman Hausherr commented on PDFBOX-2381:
-----------------------------------------

Because the size of the pushback buffer must be set when the 
PushbackInputStream is created, and the stream can't be opened a second time. 
In the worst case, that size would have to be almost as large as the PDF file 
itself, e.g. when the PDF file has just one big encoded object stream with a 
wrong stream length. That's a design flaw of the "old" parser.
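
To illustrate the limitation, here is a minimal sketch using only java.io.PushbackInputStream (the JDK class, not PDFBox's own wrapper). The class name PushbackDemo and the helper canUnread are hypothetical names made up for this example; it is not how BaseParser is written, it only shows that the buffer size is fixed at construction and that unreading more bytes than that size fails.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of PDFBox.
class PushbackDemo {

    /**
     * Reads `count` bytes from an in-memory stream wrapped in a
     * PushbackInputStream with a pushback buffer of `bufferSize` bytes,
     * then tries to push all of them back.
     * Returns true if the unread succeeded, false if the buffer was too small.
     */
    static boolean canUnread(int bufferSize, int count) {
        byte[] data = new byte[count];
        // The pushback buffer size can only be chosen here, at construction.
        try (PushbackInputStream in =
                 new PushbackInputStream(new ByteArrayInputStream(data), bufferSize)) {
            byte[] buf = new byte[count];
            int n = in.read(buf, 0, count);
            try {
                in.unread(buf, 0, n);
                return true;
            } catch (IOException e) {
                // The JDK reports "Push back buffer is full" in this case.
                return false;
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canUnread(16, 8));   // fits in the buffer
        System.out.println(canUnread(16, 32));  // exceeds the buffer
    }
}
```

A parser that only discovers a wrong stream length after reading the whole stream is in the second situation: it would need a buffer as large as the stream it wants to re-read, and it has to pick that size before it knows how large the stream is.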

> BaseParser - IOException: Push back buffer is full
> --------------------------------------------------
>
>                 Key: PDFBOX-2381
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2381
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.0
>            Reporter: John Hewson
>             Fix For: 2.0.0
>
>
> The file from PDFBOX-2320 can't be parsed with the PDFParser, but works with 
> the NonSequentialPDFParser.
> {code}
> Sep 25, 2014 10:34:51 AM org.apache.pdfbox.pdfparser.BaseParser parseCOSStream
> WARNING: Specified stream length 72519 is wrong. Fall back to reading stream until 'endstream'.
> Exception in thread "main" java.io.IOException: Could not push back 72519 bytes in order to reparse stream. Try increasing push back buffer using system property org.apache.pdfbox.baseParser.pushBackSize
>   at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:578)
>   at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:599)
>   at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:191)
>   at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1044)
>   at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1010)
>   at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:963)
>   at org.apache.pdfbox.tools.PDFToImage.main(PDFToImage.java:201)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
> Caused by: java.io.IOException: Push back buffer is full
>   at java.io.PushbackInputStream.unread(PushbackInputStream.java:215)
>   at org.apache.pdfbox.io.PushBackInputStream.unread(PushBackInputStream.java:144)
>   at org.apache.pdfbox.io.PushBackInputStream.unread(PushBackInputStream.java:133)
>   at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:574)
>   ... 11 more
> {code}
> I've not tried this with 1.8.



