Hi Panyarak,

The first error occurs because the bitstream cannot be filtered. In
this case the bitstream (a Word .doc file) was saved using the
'fast save' option, which the text extractor does not support. See
http://support.microsoft.com/kb/197978 for an explanation of what
this means.

The second message, "SKIPPED: bitstream 40 (item: 123456789/29) because
'cop?????????2.pdf.txt' already exists", is not really an error. It just
means that the item's text has already been extracted and indexed, so
the filter skips the file.
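
If you would like the daily cron mail to separate real failures from
harmless skips, a rough sketch along these lines might help. It assumes
the messages are worded exactly as in the output you quoted (other DSpace
versions may differ), and classify_filter_output is just an illustrative
name, not part of DSpace.

```python
import re

# Patterns copied from the cron output quoted in this thread. Assumption:
# other DSpace versions may word these messages differently.
ERROR_RE = re.compile(r"^ERROR filtering, skipping bitstream:")
SKIPPED_RE = re.compile(r"^SKIPPED: bitstream (\d+) \(item: ([^)]+)\)")
HANDLE_RE = re.compile(r"^\s*Item Handle:\s*(\S+)")


def classify_filter_output(text):
    """Split filter-media output into failures and skips.

    Returns (failed_handles, skipped), where failed_handles lists the
    item handles that follow an ERROR line, and skipped lists
    (bitstream_id, item_handle) pairs taken from SKIPPED lines.
    """
    failed_handles = []
    skipped = []
    awaiting_handle = False  # True between an ERROR line and its handle
    for line in text.splitlines():
        if ERROR_RE.match(line):
            awaiting_handle = True
            continue
        if awaiting_handle:
            m = HANDLE_RE.match(line)
            if m:
                failed_handles.append(m.group(1))
                awaiting_handle = False
                continue
        m = SKIPPED_RE.match(line)
        if m:
            skipped.append((int(m.group(1)), m.group(2)))
    return failed_handles, skipped
```

You could pipe the filter-media output through a small wrapper that calls
this and only mails you when failed_handles is non-empty.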

Thanks,


Stuart Lewis
Digital Services Programmer
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: 64 9 373-7599 x81928
http://www.library.auckland.ac.nz/




-----Original Message-----
From: Panyarak Ngamsritragul [mailto:[email protected]] 
Sent: Monday, 27 April 2009 1:46 p.m.
To: [email protected]
Subject: [Dspace-tech] Error in Applying Media Filters


Hi all,

I set up DSpace to run $DSPACE/bin/filter-media as a daily cron job.

It generates some errors, included below. Could someone tell me what is
going wrong?

I am using DSpace 1.5.2 on a Linux box.

Thanks.

-- 
Panyarak Ngamsritragul
Department of Mechanical Engineering
Prince of Songkla University.

=========== Part of message from cron job ===============
Applying Media Filters
ERROR filtering, skipping bitstream:

         Item Handle: 123456789/6
         Bundle Name: ORIGINAL
         File Size: 95232
         Checksum: 16cf45ecdfa77470e71443295630cbf0 (MD5)
         Asset Store: 0
org.textmining.text.extraction.FastSavedException: Fast-saved files are unsupported at this time
org.textmining.text.extraction.FastSavedException: Fast-saved files are unsupported at this time
         at org.textmining.text.extraction.WordExtractor.extractText(WordExtractor.java:63)
         at org.dspace.app.mediafilter.WordFilter.getDestinationStream(WordFilter.java:97)
         at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:668)
         at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:570)
         at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:520)
         at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:488)
         at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:427)
         at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:359)

SKIPPED: bitstream 40 (item: 123456789/29) because 'cop?????????2.pdf.txt' already exists

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech