I would like to ping Mark Wood's questions on this thread one more time ... 
Mark and I use very different language for describing what we want to do 
... as a repository manager I want to:

1. automatically register DOIs (the manual process is tedious)
2. store those DOIs in metadata fields that are meaningful both to machines 
and to people (but, yes, machines are probably more important, in this case)
3. do the above without modifying DSpace in ways that make future upgrades 
a pain in the neck.

I agree with the few repository managers who responded that DOIs are best 
stored in dc.identifier.doi ... and that external DOIs (those that we do 
not register) should be stored in another field (version.isrelationof, for 
example) ... but this is the human-readable solution. I assume that those 
who responded with this arrangement are registering DOIs manually or are 
at the very least not using DSpace to make the registration.

It makes sense to me (sort of) as Mark says: "A system which creates 
identifiers for its own purposes must know which identifiers it controls." 
... which means that, for now, DSpace should store these in dc.identifier.uri. 
But ... 

Can anyone confirm that we are not creating downstream headaches for 
systems that seek to make sense of the multiple values stored in 
dc.identifier.uri? Or ... as Mark says:

"What is the appropriate, standardized or generally accepted mapping of 
"DOI for this version of a resource" for interchange among heterogeneous 
systems?"

AND

"[N]on-brittle external systems will [parse the different types of 
identifiers] anyway to protect themselves from unknown practices at sites 
that they harvest.  Do we know of any systems which do not?"

Any thoughts on these questions?

Jere Odell
IUPUI





On Thursday, October 11, 2018 at 12:36:08 PM UTC-4, Mark Wood wrote:
>
> On Thursday, October 11, 2018 at 11:59:56 AM UTC-4, Jere Odell wrote:
>>
>> I think there's a mismatch between how librarians think metadata should be 
>> applied and how DSpace can auto-register (DataCite) DOIs. If Mark and 
>> Claudia are correct, DSpace generates DOIs in dc.identifier.uri and 
>> [cannot/is not currently able to] register DOIs from other Dublin Core 
>> fields ... such as dc.identifier.doi.
>>
>> If I understand correctly, DSpace was designed to issue one persistent 
>> identifier ... the handle. DOIs were a more recent request and, for now, if 
>> we want to auto-generate DOIs we have to store them in dc.identifier.uri. 
>> Is that correct?
>>
>> If so, that puts those of us who want to assign DOIs to our DSpace 
>> records in a difficult spot ... we must choose between a) registering the 
>> DOI manually or b) relying on a less-than-optimal metadata practice.
>>
>> Am I missing something?
>>
>>
>
> Perhaps it is I who is missing something.  How, specifically, is this 
> less-than-optimal practice?  Some points to consider:
>
> o  There actually is no such field as identifier.uri in Qualified Dublin 
> Core.  So what would an aggregator do with it?  It has no meaning outside 
> of DSpace.  It should be mapped to something standardized, when exposed to 
> harvesters.  Screen-scraping harvesters should know they are on shaky 
> ground and carefully examine the values that they find.
>
> o  Resolvable URLs for DOIs and for general Handles use distinct 
> authorities (hdl.handle.net vs. dx.doi.org).  They are easily 
> distinguished by humans and by machines.
>
> o  If a raw Handle has the prefix "10." then it is a DOI, otherwise it is 
> not.
>
> o  How a repository stores a metadata value, and how it presents it, are 
> separate questions.  What is the appropriate, standardized or generally 
> accepted mapping of "DOI for this version of a resource" for interchange 
> among heterogeneous systems?
>
> o  A system which creates identifiers for its own purposes must know which 
> identifiers it controls.  Others must know which identifiers they do not 
> control.  I presume that this is why the DOI identifier providers use one 
> field and the stock submission form uses another.
>
> I would have preferred that different types of identifiers were stored 
> separately, so we don't have to parse them to know what they are.  But that 
> isn't difficult, and non-brittle external systems will do that anyway to 
> protect themselves from unknown practices at sites that they harvest.  Do 
> we know of any systems which do not?
>
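
For reference, the parsing that Mark describes (telling DOIs from Handles 
when both sit in dc.identifier.uri) might be sketched roughly as below. This 
is only an illustration, not anything taken from DSpace. The function name 
classify_identifier is hypothetical, the host "doi.org" is an assumption 
added alongside the dx.doi.org named in the thread, and the pattern for raw 
identifiers assumes numeric Handle prefixes.

```python
import re
from urllib.parse import urlparse

# Resolver hosts. The thread names hdl.handle.net and dx.doi.org;
# "doi.org" is an assumption added for this sketch.
DOI_HOSTS = {"dx.doi.org", "doi.org"}
HANDLE_HOSTS = {"hdl.handle.net"}

# Assumed shape of a raw identifier: numeric prefix, a slash, then a suffix.
RAW_PATTERN = re.compile(r"\d+(\.\d+)*/\S+")


def classify_identifier(value):
    """Return 'doi', 'handle', or 'other' for one dc.identifier.uri value."""
    text = value.strip()
    parsed = urlparse(text)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        path = parsed.path.lstrip("/")
        if host in DOI_HOSTS:
            return "doi"
        if host in HANDLE_HOSTS:
            # A raw Handle whose prefix starts with "10." is a DOI,
            # even when it is resolved through the Handle proxy.
            return "doi" if path.startswith("10.") else "handle"
        return "other"
    # Not a URL: drop an optional "doi:" or "hdl:" label, then check the prefix.
    lowered = text.lower()
    for label in ("doi:", "hdl:"):
        if lowered.startswith(label):
            text = text[len(label):]
            break
    if not RAW_PATTERN.fullmatch(text):
        return "other"
    return "doi" if text.startswith("10.") else "handle"
```

A harvester could run each dc.identifier.uri value through a check like 
this and route the DOIs and Handles to whatever fields it prefers, rather 
than trusting the field name alone.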
