I'm trying to retrieve the extracted text bitstreams associated with items.
Is there a way to get a list of them from the database?

So far, I've only been able to generate a list of all bitstreams with:

SELECT i.item_id, last_modified, owning_collection, internal_id,
t.text_value AS title
FROM item i
JOIN item2bundle i2b
ON i.item_id = i2b.item_id
JOIN bundle2bitstream b2b
ON b2b.bundle_id = i2b.bundle_id
JOIN bitstream b
ON b.bitstream_id = b2b.bitstream_id
JOIN metadatavalue d
ON d.resource_id = i.item_id
JOIN metadatavalue t
ON t.resource_id = i.item_id
WHERE in_archive = 't' AND withdrawn = 'f' AND discoverable = 't'
AND d.metadata_field_id = 11
AND d.text_value >= '2021-01' AND d.text_value < '2021-12'
AND t.metadata_field_id = 64
ORDER BY owning_collection

That gives me a list including the internal_id, which I can use to
determine where the file is in the assetstore:
77274565375792968793874045792320511138 =
/dspace/assetstore/77/27/45/77274565375792968793874045792320511138
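
For reference, the mapping above can be sketched as a small helper. This is only an illustration based on the single example shown: it assumes three directory levels of two characters each, taken from the start of the internal_id, and an assetstore root of /dspace/assetstore. The function name assetstore_path is hypothetical and not part of DSpace.

```python
from pathlib import PurePosixPath


def assetstore_path(internal_id: str, base: str = "/dspace/assetstore") -> str:
    """Build the expected file path for a bitstream's internal_id.

    Assumption (inferred from the one example above): the first six
    characters of the internal_id form three two-character directories.
    """
    if len(internal_id) < 6:
        raise ValueError("internal_id is too short to derive a directory path")
    # Characters 0-1, 2-3 and 4-5 become the three directory levels.
    levels = [internal_id[i:i + 2] for i in (0, 2, 4)]
    return str(PurePosixPath(base, *levels, internal_id))


print(assetstore_path("77274565375792968793874045792320511138"))
```

Applied to each internal_id returned by the query, this gives the path to check for that particular bitstream.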

However, I've noticed some gaps. For example, item id 4117 has both a PDF
and an extracted text bitstream, but the corresponding assetstore directory
contains only the PDF:
$ ls /dspace/assetstore/77/27/45/
77274565375792968793874045792320511138

How can I determine the location of the associated text extract bitstream
for that item?

Sean

DSpace version:  CRIS-5.10.0-SNAPSHOT
  SCM revision:  67e7d010e7eda86925980b2a43581b9d4f4929a3
    SCM branch:  dspace-5_x_x-cris
            OS:  Linux(amd64) version 4.4.0-210-generic
  Applications:
     Discovery:  enabled.
           JRE:  Private Build version 1.8.0_292
   Ant version:  Apache Ant(TM) version 1.9.6 compiled on July 20 2018
 Maven version:  3.3.9
