If you switch from using PDFBox to XPDF, most if not all of these errors will
disappear. As a bonus, your filter-media will run much, much faster too!
Google "DSpace AND installing XPDF" and you'll find a bunch of articles on how
to do this.
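One more thing that may be worth checking: the first trace quoted below fails in java.io.File.createNewFile rather than in PDFBox. On Unix that call typically throws an IOException like this one when the directory the file should be created in does not exist. Here is a minimal sketch of such a check, assuming nothing about your actual asset store layout. The class name CreateFileCheck, the method diagnose, and the 12/34/56 subdirectory are made up for illustration and are not part of DSpace.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class CreateFileCheck {

    /**
     * Returns null if the parent directory of target exists and is writable,
     * otherwise a short description of what looks wrong.
     */
    static String diagnose(File target) {
        File parent = target.getAbsoluteFile().getParentFile();
        if (parent == null || !parent.isDirectory()) {
            return "parent directory missing: " + parent;
        }
        if (!parent.canWrite()) {
            return "parent directory not writable: " + parent;
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for an asset store directory.
        File base = Files.createTempDirectory("assetstore-demo").toFile();
        File target = new File(new File(base, "12/34/56"), "bitstream.bin");

        // The nested directories do not exist yet, so diagnose reports a problem
        // and createNewFile is expected to throw an IOException.
        System.out.println("before mkdirs: " + diagnose(target));
        try {
            target.createNewFile();
        } catch (IOException e) {
            System.out.println("createNewFile failed: " + e.getMessage());
        }

        // Once the parent directories exist, the same call succeeds.
        target.getParentFile().mkdirs();
        System.out.println("after mkdirs: " + diagnose(target));
        System.out.println("created: " + target.createNewFile());
    }
}
```

If something like diagnose reports a missing or unwritable directory for the account that runs filter-media, that would point at the asset store path or its permissions rather than at the PDFs themselves.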
Best of luck,
Sue
Sue Walker-Thornton
(w): (757) 864-2368
(m): (757) 506-9903
From: Brett Arno [mailto:[email protected]]
Sent: Wednesday, April 25, 2012 5:15 PM
To: [email protected]
Subject: [Dspace-tech] Media Filter Errors
Hello All,
I'm receiving a good portion of errors when running the filter-media command
and wondering if anyone can provide some insight.
I'm running 1.7.2 XMLUI with Mirage on a Red Hat server. Most items in the
instance give this error:
ERROR filtering, skipping bitstream:
Item Handle: 10829/669
Bundle Name: ORIGINAL
File Size: 43066
Checksum: d302cf0378a385ff16610d63943b5368 (MD5)
Asset Store: 0
java.io.IOException: No such file or directory
java.io.IOException: No such file or directory
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createNewFile(File.java:900)
at edu.sdsc.grid.io.local.LocalFile.createNewFile(LocalFile.java:486)
at org.dspace.storage.bitstore.BitstreamStorageManager.store(BitstreamStorageManager.java:300)
at org.dspace.content.Bitstream.create(Bitstream.java:205)
at org.dspace.content.Bundle.createBitstream(Bundle.java:384)
at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:760)
at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:561)
at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:511)
at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:479)
at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:414)
at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:333)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:183)
The first thing I checked was that the items' files exist and can be accessed
through the system. In my large test sample, every item had a PDF and all of
them opened fine.
I've tried completely rebuilding the index to see if that may help, but that
didn't change the results. I've tried indexing with these commands and none of
them helped:
[dspace]/bin/dspace index-update
[dspace]/bin/dspace index-init
[dspace]/bin/dspace index-init -r -f
We are using the PDF filter that shipped with the system, and we are not using
Discovery.
I also noticed this error in the DSpace log:
WARN org.apache.pdfbox.util.PDFStreamEngine @ java.io.IOException: Error: expected hex character and not :32
java.io.IOException: Error: expected hex character and not :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:336)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:139)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:556)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:390)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:386)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:567)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:250)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:208)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:378)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:302)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:258)
at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:101)
at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:737)
at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:561)
at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:511)
at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:479)
at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:414)
at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:333)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:183)
Any help would be greatly appreciated!
--
Brett Arno
Library Systems Support Specialist
Herrick Memorial Library
Alfred University
1 Saxon Drive
Alfred, NY 14802
Email: [email protected] | Phone: 607-871-2989
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech