I'm trying to retrieve the extracted text bitstream associated with items.
Is there a way to get a list of them from the database?
So far, I've only been able to generate a list of all bitstreams with:
SELECT i.item_id, last_modified, owning_collection, internal_id,
       t.text_value AS title
FROM item i
JOIN item2bundle i2b
  ON i.item_id = i2b.item_id
JOIN bundle2bitstream b2b
  ON b2b.bundle_id = i2b.bundle_id
JOIN bitstream b
  ON b.bitstream_id = b2b.bitstream_id
JOIN metadatavalue d
  ON d.resource_id = i.item_id
JOIN metadatavalue t
  ON t.resource_id = i.item_id
WHERE in_archive = 't' AND withdrawn = 'f' AND discoverable = 't'
  AND d.metadata_field_id = 11
  AND d.text_value >= '2021-01' AND d.text_value < '2021-12'
  AND t.metadata_field_id = 64
ORDER BY owning_collection
That gives me a list including the internal_id, which I can use to
determine where the file is in the assetstore:
77274565375792968793874045792320511138 =
/dspace/assetstore/77/27/45/77274565375792968793874045792320511138
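For reference, the mapping above can be sketched in a few lines of Python. This is only an illustration: it assumes that every internal_id follows the pattern in the example (the first three pairs of digits become three nested directories), and assetstore_path is a hypothetical helper name, not part of DSpace.

```python
from pathlib import PurePosixPath


def assetstore_path(internal_id: str, base: str = "/dspace/assetstore") -> str:
    """Build the on-disk path for a bitstream from its internal_id.

    Assumption: the layout matches the example in this message, i.e. the
    first three two-digit pairs of the id name three nested directories.
    """
    if len(internal_id) < 6:
        raise ValueError("internal_id is too short to derive a directory path")
    # Take the digit pairs at offsets 0, 2 and 4 as directory names.
    parts = [internal_id[i:i + 2] for i in (0, 2, 4)]
    return str(PurePosixPath(base, *parts, internal_id))


print(assetstore_path("77274565375792968793874045792320511138"))
```

Feeding each internal_id returned by the query through a helper like this gives one candidate path per bitstream row, which can then be checked against the filesystem.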
But I've noticed some gaps. Item 4117, for example, has both a PDF and an
extracted text bitstream, but its assetstore directory contains only the
PDF:
$ ls /dspace/assetstore/77/27/45/
77274565375792968793874045792320511138
How can I determine the location of the associated text extract bitstream
for that item?
Sean
DSpace version: CRIS-5.10.0-SNAPSHOT
SCM revision: 67e7d010e7eda86925980b2a43581b9d4f4929a3
SCM branch: dspace-5_x_x-cris
OS: Linux(amd64) version 4.4.0-210-generic
Applications:
Discovery: enabled.
JRE: Private Build version 1.8.0_292
Ant version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018
Maven version: 3.3.9