Thanks guys (I'm on v1.4.1 here for our main repository).
You're right - I only tried index-all from the command line earlier, when I was trying to figure out why this wasn't working. Apologies, an example of brain freeze!! I had a quiet "D'oh" moment when someone mentioned filter-media :-)

I tried filter-media from the command line and it did indeed bomb out fairly early on due to a protected PDF/Bouncy Castle type error, which is presumably why the cron filter-media wasn't doing its job. I dropped the Bouncy Castle PDF jars into the lib directory (copied over from a v1.4.2 repo I'm also running), re-ran filter-media and that seems to have done the trick - my PDF has now been filtered and indexed and can be searched from within DSpace :-)

Interestingly, I did still get a couple of errors, but these didn't stop the filter-media process as was the case previously (I don't know if this is because of the new jars or if these are less serious errors than the one that previously caused filter-media to bomb out). Just for reference, these are the errors I'm seeing:

ERROR filtering, skipping bitstream #364 java.util.NoSuchElementException
java.util.NoSuchElementException
        at java.util.AbstractList$Itr.next(AbstractList.java:426)
        at org.textmining.text.extraction.WordExtractor.extractText(WordExtractor.java:150)
        at org.dspace.app.mediafilter.WordFilter.getDestinationStream(WordFilter.java:97)
        at org.dspace.app.mediafilter.MediaFilter.processBitstream(MediaFilter.java:155)
        at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:327)
        at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:296)
        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:266)
        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:234)
        at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:185)

ERROR filtering, skipping bitstream #169 java.io.IOException: Error decrypting document, details: Error: The supplied password does not match either the owner or user password in the document.
java.io.IOException: Error decrypting document, details: Error: The supplied password does not match either the owner or user password in the document.
        at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:208)
        at org.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:149)
        at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:110)
        at org.dspace.app.mediafilter.MediaFilter.processBitstream(MediaFilter.java:155)
        at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:327)
        at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:296)
        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:266)
        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:234)
        at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:185)

Thanks again for all the useful advice and pointers, and for helping me to sort this out (and getting me past my brain freeze!).

Cheers,
Mike

Michael White
eLearning Developer
Centre for eLearning Development (CeLD)
S7, The Library
University of Stirling
Stirling
SCOTLAND
FK9 4LA
Email: [EMAIL PROTECTED]
Tel: +44 (0) 1786 466877
Fax: +44 (0) 1786 466880
http://www.is.stir.ac.uk/celd/