Re: Problems parsing a PDF

Andreas Lehmkühler Tue, 05 May 2009 11:14:34 -0700

Hi Georg,

Georg Datterl wrote:

Hello dear PDFBox-Users.
Coming to PDFBox from fop via xmlgraphics and the fop-pdf-images package, I have to admit I don't know much about the internal structure of a PDF file. I have a PDF file that I want to load using PDFParser.load(InputStream, null), but partway through parsing, BaseParser.parseDirObject() throws an IOException("expected false actual='fa'"). I downloaded the latest source code for the class, and line 871 does indeed expect the string "false", but it receives "fa" followed by three empty bytes. Can anybody tell me how I can find out why the PDF cannot be parsed? Maybe the file is corrupted at an earlier point, but Acrobat can display it and iText can parse it without problems. Since the PDF file is 2 MB, I don't want to send it to the list. If somebody would be kind enough to take the time to look into the PDF, I could send the file by mail.

Please create an issue in JIRA [1] and attach the PDF document in question.
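
In the meantime, one rough way to see where the parser might be stumbling is to scan the raw file for the byte sequence from the error message. The following is only a minimal sketch that does not use PDFBox at all. The class name TruncatedTokenScan is hypothetical, and it assumes the "three empty bytes" are NUL (0x00) bytes, which the report does not confirm. It prints every offset where "fa" is followed by three such bytes, so the surrounding bytes can be inspected in a hex editor.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
public class TruncatedTokenScan {
    // Assumption: the "three empty bytes" are NUL (0x00) bytes.
    static final byte[] PATTERN = {'f', 'a', 0, 0, 0};

    // Returns every offset in data at which pattern starts.
    static List<Integer> findOffsets(byte[] data, byte[] pattern) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i + pattern.length <= data.length; i++) {
            int j = 0;
            while (j < pattern.length && data[i + j] == pattern[j]) {
                j++;
            }
            if (j == pattern.length) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the PDF file to inspect.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        for (int offset : findOffsets(data, PATTERN)) {
            System.out.println("'fa' + 3 NUL bytes at offset " + offset);
        }
    }
}
```

If this reports no offsets, the bytes are probably intact in the file itself, which would suggest the truncation happens while the stream is being read rather than in the document.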

Thanks,

Andreas Lehmkühler

[1] https://issues.apache.org/jira/browse/PDFBOX
