Hi folks

We run PDFBox for PDF text extraction under the DSpace application.

Occasionally we get the odd failure, and we’re investigating some errors just now.

One is:

java.lang.RuntimeException: java.io.IOException: Not a number: +
    at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:178)
    at org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:187)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:266)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225)
    at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:442)
    at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:366)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:322)
    at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:101)
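
The trace above shows an IOException wrapped in a RuntimeException, so a plain catch of IOException around the extraction call would not see it. As a rough sketch only, this is how the wrapped cause could be dug out so that one bad PDF gets logged and skipped rather than aborting the run. The class and method names (ExtractionFailures, findIOException) are hypothetical and are not part of PDFBox or DSpace, and the Runnable merely stands in for the real extraction call; no PDF is parsed here.

```java
import java.io.IOException;

// Hypothetical helper for illustration; not part of PDFBox or DSpace.
class ExtractionFailures {

    // Walk the cause chain and return the first IOException found, or null.
    static IOException findIOException(Throwable t) {
        int depth = 0; // guard against cyclic cause chains
        while (t != null && depth++ < 32) {
            if (t instanceof IOException) {
                return (IOException) t;
            }
            t = t.getCause();
        }
        return null;
    }

    public static void main(String[] args) {
        // Stand-in for the extraction call: throws the same exception shape
        // as the trace (an IOException wrapped in a RuntimeException).
        Runnable extraction = () -> {
            throw new RuntimeException(new IOException("Not a number: +"));
        };
        try {
            extraction.run();
        } catch (RuntimeException e) {
            IOException cause = findIOException(e);
            if (cause == null) {
                throw e; // no IOException in the chain, so let it propagate
            }
            System.err.println("Skipping unparseable PDF: " + cause.getMessage());
        }
    }
}
```

Whether skipping is the right behaviour for a given repository is a separate question; the sketch only shows that the underlying IOException is still reachable through getCause().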


And here’s another:


--
Scott Renton
Digital Development
Library and University Collections
Argyle House, Floor F
ext: 515219

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
