Hi Panyarak,

The first of your errors occurs because one bitstream cannot be filtered. The bitstream (a Word .doc file) was saved using Word's 'fast save' option, which the text extractor does not support. See http://support.microsoft.com/kb/197978 for an explanation of what fast save means.
The second message, "SKIPPED: bitstream 40 (item: 123456789/29) because 'cop?????????2.pdf.txt' already exists", is not really an error. It just means that the item has already had its text extracted and indexed, so the filter skips the file.

Thanks,

Stuart Lewis
Digital Services Programmer
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: 64 9 373-7599 x81928
http://www.library.auckland.ac.nz/

-----Original Message-----
From: Panyarak Ngamsritragul [mailto:[email protected]]
Sent: Monday, 27 April 2009 1:46 p.m.
To: [email protected]
Subject: [Dspace-tech] Error in Applying Media Filters

Hi all,

I have set DSpace to run $DSPACE/bin/filter-media as a cron job every day. It generates some errors, shown below. Could someone tell me what is going wrong? I am using 1.5.2 on a Linux box.

Thanks.

--
Panyarak Ngamsritragul
Department of Mechanical Engineering
Prince of Songkla University.

=========== Part of message from cron job ===============
Applying Media Filters
ERROR filtering, skipping bitstream:
    Item Handle: 123456789/6
    Bundle Name: ORIGINAL
    File Size: 95232
    Checksum: 16cf45ecdfa77470e71443295630cbf0 (MD5)
    Asset Store: 0
org.textmining.text.extraction.FastSavedException: Fast-saved files are unsupported at this time
org.textmining.text.extraction.FastSavedException: Fast-saved files are unsupported at this time
    at org.textmining.text.extraction.WordExtractor.extractText(WordExtractor.java:63)
    at org.dspace.app.mediafilter.WordFilter.getDestinationStream(WordFilter.java:97)
    at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:668)
    at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:570)
    at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:520)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:488)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:427)
    at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:359)
SKIPPED: bitstream 40 (item: 123456789/29) because 'cop?????????2.pdf.txt' already exists

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
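
For anyone triaging similar cron mail, the distinction made in the reply (a real filter failure versus a harmless SKIPPED notice) can be checked mechanically. Below is a minimal Python sketch, assuming the output is worded as in the excerpt quoted in this thread for DSpace 1.5.2. The function name summarize, the regular expressions, and the exact line breaks in the sample text are illustrative assumptions and not part of DSpace; other versions may word these messages differently.

```python
import re

# Patterns follow the message text quoted in this thread (DSpace 1.5.2).
# They are assumptions about the log wording, not a DSpace API.
ERROR_MARKER = "ERROR filtering, skipping bitstream"
HANDLE_RE = re.compile(r"Item Handle:\s*(\S+)")
SKIPPED_RE = re.compile(r"SKIPPED: bitstream (\d+) \(item: ([^)]+)\)")
EXCEPTION_RE = re.compile(r"^\s*([\w.]+(?:Exception|Error)): (.*)$")

# Sample text reconstructed from the cron output in this thread
# (line breaks are assumed; the stack trace is shortened).
SAMPLE = """Applying Media Filters
ERROR filtering, skipping bitstream:
\tItem Handle: 123456789/6
\tBundle Name: ORIGINAL
\tFile Size: 95232
\tChecksum: 16cf45ecdfa77470e71443295630cbf0 (MD5)
\tAsset Store: 0
org.textmining.text.extraction.FastSavedException: Fast-saved files are unsupported at this time
\tat org.textmining.text.extraction.WordExtractor.extractText(WordExtractor.java:63)
SKIPPED: bitstream 40 (item: 123456789/29) because 'cop?????????2.pdf.txt' already exists
"""


def summarize(log_text):
    """Split filter-media output into real failures and harmless skips."""
    failures = []  # one dict per ERROR block: {"handle": ..., "reason": ...}
    skipped = []   # (bitstream_id, item_handle) for each SKIPPED notice
    current = None
    for line in log_text.splitlines():
        skip = SKIPPED_RE.search(line)
        if skip:
            skipped.append((int(skip.group(1)), skip.group(2)))
            current = None  # a SKIPPED line ends any open ERROR block
            continue
        if ERROR_MARKER in line:
            current = {"handle": None, "reason": None}
            failures.append(current)
        if current is None:
            continue
        handle = HANDLE_RE.search(line)
        if handle and current["handle"] is None:
            current["handle"] = handle.group(1)
        exc = EXCEPTION_RE.match(line)
        if exc and current["reason"] is None:
            current["reason"] = exc.group(2).strip()
    return failures, skipped


if __name__ == "__main__":
    failed, already_done = summarize(SAMPLE)
    for f in failed:
        print("FAILED:", f["handle"], "-", f["reason"])
    for bitstream_id, item in already_done:
        print("already filtered: bitstream", bitstream_id, "in item", item)
```

Only the entries in the first list need attention. In this thread that is the fast-saved Word file in item 123456789/6, while the SKIPPED entry for item 123456789/29 can be ignored.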

