Hi folks,

We run PDFBox for PDF text extraction under the DSpace application.
Occasionally we get the odd failure, and we’re investigating some errors just now. One is:

java.lang.RuntimeException: java.io.IOException: Not a number: +
java.lang.RuntimeException: java.io.IOException: Not a number: +
    at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:178)
    at org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:187)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:266)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225)
    at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:442)
    at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:366)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:322)
    at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:101)

And here’s another:

--
Scott Renton
Digital Development
Library and University Collections
Argyle House, Floor F
ext: 515219

