Dear list,

I've implemented a curation task to read country names from item metadata
and add new metadata fields with the appropriate ISO 3166-1 alpha-2 codes if
they don't already exist. On DSpace 5 the task finishes in an hour or
sometimes two, but on DSpace 6 it runs for twelve hours and I end up
killing it. As far as I can tell I ported the DSpace 5 version¹ to DSpace 6
faithfully², though I'm wondering if I missed something with regard to
caching, as that seems to have been removed (or internalized) with the
service API / Hibernate overhaul. I would be grateful if someone could take
a look.

Another thing I noticed is that when I use "-i all" to process all items in
the repository, the curation task curates each item multiple times, once for
each collection it is mapped to. Our repository has ~90,000 items and in
our case that results in reprocessing ~25,000 items(!). Would it be better
to write a standalone Java utility for this rather than using the curation
interface?
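
To illustrate the duplicate processing, here is a minimal sketch of one
possible guard: remember the IDs of items already handled and skip repeats.
The class name SeenItemGuard and its methods are hypothetical and not part
of DSpace. It assumes item IDs are java.util.UUID values (as in DSpace 6)
and that the same task instance is reused for the whole "-i all" run, which
I have not verified.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/**
 * Hypothetical helper (not a DSpace class): remembers which item IDs have
 * already been handled during a single curation run.
 */
final class SeenItemGuard {
    private final Set<UUID> seen = new HashSet<>();

    /**
     * Returns true only the first time a given ID is offered, so an item
     * mapped to several collections is processed once.
     */
    boolean firstVisit(UUID itemId) {
        // Set.add returns false when the element was already present.
        return seen.add(itemId);
    }

    /** Number of distinct items seen so far. */
    int distinctCount() {
        return seen.size();
    }
}
```

Inside the task's perform method this would be used as something like
"if (!guard.firstVisit(item.getID())) { skip the item }". The set holds one
UUID per item, so memory use stays small even for ~90,000 items.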

Thank you,

¹ https://github.com/ilri/cgspace-java-helpers/blob/dspace5/src/main/java/io/github/ilri/cgspace/ctasks/CountryCodeTagger.java
² https://github.com/ilri/cgspace-java-helpers/blob/dspace6/src/main/java/io/github/ilri/cgspace/ctasks/CountryCodeTagger.java

-- 
Alan Orth
[email protected]
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch
