Hi,

On 04/05/2011 09:07 AM, Shinichiro Abe wrote:
> It seems like an error raised at pdfbox, and pdfbox cannot recognize something about XrefTable of the pdf? What kind of error is it?
The PDF in question might be malformed, or there could be a bug in PDFBox that prevents it from correctly parsing this file.
The best way to get the problem solved is to report it to the PDFBox issue tracker at https://issues.apache.org/jira/browse/PDFBOX, ideally with the sample PDF as an attachment.
Such problems are fairly common when you are dealing with large numbers of files from many different sources. Usually they aren't too troublesome, as you can often live with not being able to search such documents by their full-text contents. For example, in Apache Jackrabbit we simply log such problems and index the document as if it were empty. It's of course still a good idea to report such issues so they can be fixed in future versions.
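As a rough illustration of the "log it and index as empty" approach, here is a minimal sketch. It is not the actual Jackrabbit code: the `TextExtractor` interface, the `LenientExtraction` class and the `extractOrEmpty` method are hypothetical names made up for this example, and the real parser (for example one built on PDFBox) would sit behind that interface.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

class LenientExtraction {

    /** Hypothetical stand-in for a real parser, e.g. one backed by PDFBox. */
    public interface TextExtractor {
        String extract(InputStream in) throws IOException;
    }

    private static final Logger LOG =
            Logger.getLogger(LenientExtraction.class.getName());

    /**
     * Returns the extracted text, or an empty string if extraction fails.
     * The failure is logged so the offending file can be reported upstream.
     */
    public static String extractOrEmpty(
            TextExtractor extractor, InputStream in, String name) {
        try {
            return extractor.extract(in);
        } catch (IOException | RuntimeException e) {
            // Malformed input or a parser bug: keep indexing, just without full text.
            LOG.log(Level.WARNING,
                    "Failed to extract text from " + name + "; indexing it as empty", e);
            return "";
        }
    }
}
```

With something like this, a broken PDF still gets into the index with its other metadata and just isn't found by full-text queries, while the log entry tells you which files are worth attaching to a bug report.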
-- Jukka Zitting
