If there is more than one document for this handle, you can identify
the right one by looking at the "size_bytes" column in the bitstream
table; it should match the "File Size" value in the error output below.
Here is a SQL query I use to list information from the bitstream,
bundle, item, and handle tables when I know one piece of information
(say, the handle) and don't know the rest. You can modify the query as
needed. In the situation below, it lists the rows in the bitstream table
when you only know the handle. Hope this helps.
Sue
-- List every bitstream attached to the item that has the given handle.
select bi.*
from bitstream bi
   , bundle2bitstream b2b
   , bundle bu
   , item2bundle i2b
   , item it
   , handle ha
where ha.resource_id = it.item_id
  and it.item_id = i2b.item_id
  and i2b.bundle_id = bu.bundle_id
  and bu.bundle_id = b2b.bundle_id
  and b2b.bitstream_id = bi.bitstream_id
  and ha.handle = '2121/169402'  -- replace with the handle of your item
;
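
To show how the query above can be narrowed by file size, here is a small Python sketch that runs it against an in-memory SQLite database. This is only an illustration: the tables are simplified stand-ins for the real DSpace schema (only the columns the joins need, plus an assumed "name" column), the rows are made up, and the function name find_bitstreams is hypothetical. A real DSpace installation would run the SQL directly against its own database.

```python
import sqlite3

# Simplified stand-ins for the DSpace tables; only the columns the query needs.
SCHEMA = """
create table handle (handle text, resource_id integer);
create table item (item_id integer);
create table item2bundle (item_id integer, bundle_id integer);
create table bundle (bundle_id integer, name text);
create table bundle2bitstream (bundle_id integer, bitstream_id integer);
create table bitstream (bitstream_id integer, name text, size_bytes integer);
"""

# Same joins as the query above, with the handle and file size as parameters.
QUERY = """
select bi.bitstream_id, bi.name, bi.size_bytes
from bitstream bi, bundle2bitstream b2b, bundle bu,
     item2bundle i2b, item it, handle ha
where ha.resource_id = it.item_id
  and it.item_id = i2b.item_id
  and i2b.bundle_id = bu.bundle_id
  and bu.bundle_id = b2b.bundle_id
  and b2b.bitstream_id = bi.bitstream_id
  and ha.handle = ?
  and bi.size_bytes = ?
"""

def find_bitstreams(conn, handle, size_bytes):
    """Return the bitstreams of the item with this handle that have this size."""
    return conn.execute(QUERY, (handle, size_bytes)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Made-up sample rows: one item with two bitstreams in its ORIGINAL bundle.
conn.executescript("""
insert into handle values ('10425/53', 7);
insert into item values (7);
insert into item2bundle values (7, 20);
insert into bundle values (20, 'ORIGINAL');
insert into bundle2bitstream values (20, 100);
insert into bundle2bitstream values (20, 101);
insert into bitstream values (100, 'report.doc', 11301578);
insert into bitstream values (101, 'appendix.doc', 4096);
""")

matches = find_bitstreams(conn, "10425/53", 11301578)
print(matches)  # [(100, 'report.doc', 11301578)]
```

Filtering on both the handle and the size reported in the error output leaves only the bitstream that failed, provided no two files on the item have exactly the same size.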
-----Original Message-----
From: Mark H. Wood [mailto:[email protected]]
Sent: Friday, March 06, 2009 9:03 AM
To: [email protected]
Subject: Re: [Dspace-tech] error running filter-media script
On Thu, Mar 05, 2009 at 03:58:52PM -0600, Jewel wrote:
> I am running Dspace version 1.5.1 on a Windows 2003 box. We have loaded
> very little into our collection. I can't make out what the error means.
> Below is the error I receive after running: dsrun
> org.dspace.app.mediafilter.MediaFilterManager
> /
> E:\dspace\bin>dsrun org.dspace.app.mediafilter.MediaFilterManager
> Using DSpace installation in: E:\dspace
> ERROR filtering, skipping bitstream:
>
> Item Handle: 10425/53
> Bundle Name: ORIGINAL
> File Size: 11301578
> Checksum: 4a6333832dc9b7ee8704b2c0ec735bbe (MD5)
> Asset Store: 0
> java.io.IOException: Invalid header signature; read 3759996809423114277,
> expected -2226271756974174256
> java.io.IOException: Invalid header signature; read 3759996809423114277,
> expected -2226271756974174256
>         at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:88)
It sure would be nice if the message indicated which bitstream had the
problem, no? It appears that one of the bitstreams attached to item
53 is either a corrupt MS Office document, or is not an MS Office
document at all but DSpace believes it is one. (POI is the library
that DSpace uses to extract text from MS Word documents.)
If there is only one Office document attached to item 53, that is the
culprit. If there is more than one, examine each until you find the
problematic one. If there are no bitstreams that should be treated as
Office documents, check the format associated with each bitstream to see
if it matches the content type you would expect.
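
The two numbers in the exception can also be decoded to see what the file actually starts with. Below is a minimal Python sketch, assuming that POI reads the first 8 bytes of the file as a little-endian signed 64-bit integer and compares it with the expected signature. The helper name signature_bytes is only for illustration.

```python
import struct

def signature_bytes(value: int) -> bytes:
    """Return the 8 bytes that a little-endian signed 64-bit integer encodes."""
    return struct.pack("<q", value)

# Values copied from the "Invalid header signature" message.
read_bytes = signature_bytes(3759996809423114277)       # what was read
expected_bytes = signature_bytes(-2226271756974174256)  # what was expected

print(read_bytes)            # b'%PDF-1.4'
print(expected_bytes.hex())  # d0cf11e0a1b11ae1
```

The expected bytes (D0 CF 11 E0 A1 B1 1A E1) are the signature of the OLE2 compound document format used by legacy MS Office files. The bytes actually read spell "%PDF-1.4", which suggests that the bitstream is a PDF whose format is recorded as MS Word, rather than a corrupt Word document.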
--
Mark H. Wood, Lead System Programmer [email protected]
Friends don't let friends publish revisable-form documents.