Thanks for all the responses and insight thus far.

My thinking about nested metadata is that it tries to solve the problem of
metadata about metadata about an item:
Item/1234 => Author => [Peter Dietz, Developer, Longsight, Brown eyes, ...]

What we've been doing is that all metadata values are really of one type,
and that's text. We don't care what it is, we don't validate it, we just
store the text value of whatever you've given us. One problem we run into
is with date fields: code that reads them tries to parse the text, and it's
not a valid/parseable date. Date => Unknown, Date => 2015/04, Date => 1960's,
Date => 08/04/2015. We could/should validate values stored in date-type
metadata fields as ISO 8601.
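A minimal sketch of that kind of check, assuming only year, year-month, and
full-date precision need to be accepted. The class and method names are
hypothetical and not part of DSpace; it uses only the standard java.time API.

```java
import java.time.LocalDate;
import java.time.Year;
import java.time.YearMonth;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of DSpace: accepts only ISO 8601
// calendar dates at year, year-month, or full-date precision.
class IsoDateCheck {

    static boolean isIso8601(String value) {
        if (value == null) {
            return false;
        }
        try {
            // Pick the parser by length so "2015", "2015-04" and
            // "2015-08-04" are each checked against the right shape.
            switch (value.length()) {
                case 4:  Year.parse(value); return true;
                case 7:  YearMonth.parse(value); return true;
                case 10: LocalDate.parse(value); return true;
                default: return false;
            }
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"Unknown", "2015/04", "1960's", "08/04/2015",
                            "2015-04", "2015-08-04"};
        for (String s : samples) {
            System.out.println(s + " -> " + isIso8601(s));
        }
    }
}
```

With this check, all four problem values listed above would be rejected at
submission time, while 2015-04 and 2015-08-04 would pass.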

As for storing other types of information: when a metadata field is backed
by an authority control system, we store a foreign key / reference, and
then Solr stores an encoding of the metadata we fetch from the metadata
service provider. In the case of the ORCID integration it can grab:

givenNames, familyName, creditName, otherNames, country, keyword,
external_identifier, researcher_url, biography

So we have a form of a schema for storing this "object" inside of a
metadata value. Our current metadata system is basically a key/value store:
key = metadata field (e.g. dc.title), and value is unspecified, but usually
just text. Could we define a type called _nested_orcid_author, which has to
be JSON and may only contain the above fields, and validate against it? That
looks like an object, an OrcidAuthor object. We'd need a schema to enforce
that, but then we're building tables and classes for that one field. Maybe
some type of key/key/value store would be appropriate?
dc.author => {{ _nested_metadata_object }}

Then a NestedMetadataObject can have keys (metadata_field_id) and values
(unspecified text).

So:
NestedMetadataValues nmValues = item.getMetadata("dc.author");
nmValue0 = nmValues[0];
nmValue0.getMetadata("dc.author.firstname") ==> "Peter"
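
The key/key/value idea can be sketched in plain Java as a map of lists of
maps. Every type and method name below is hypothetical and chosen only to
mirror the pseudocode; none of it exists in DSpace, and the inner values stay
unvalidated text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration; none of these exist in DSpace.
class NestedMetadataSketch {

    // Inner level: sub-field name -> unvalidated text value.
    static class NestedMetadataValue {
        private final Map<String, String> fields = new LinkedHashMap<>();

        NestedMetadataValue put(String field, String text) {
            fields.put(field, text);
            return this;
        }

        String getMetadata(String field) {
            return fields.get(field);
        }
    }

    // Outer level: metadata field -> ordered list of nested values.
    static class Item {
        private final Map<String, List<NestedMetadataValue>> metadata =
                new LinkedHashMap<>();

        void addMetadata(String field, NestedMetadataValue value) {
            metadata.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
        }

        List<NestedMetadataValue> getMetadata(String field) {
            return metadata.getOrDefault(field, new ArrayList<>());
        }
    }

    public static void main(String[] args) {
        Item item = new Item();
        item.addMetadata("dc.author", new NestedMetadataValue()
                .put("dc.author.firstname", "Peter")
                .put("dc.author.lastname", "Dietz"));
        NestedMetadataValue first = item.getMetadata("dc.author").get(0);
        System.out.println(first.getMetadata("dc.author.firstname"));
    }
}
```

The trade-off is that nothing here constrains which sub-fields an author may
carry; that would still need a schema of some kind.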


Is that the approach? Or is it best to stick with the authority framework
and build some type of MetadataAuthorityProvider for each "rich" / "nested"
metadata object? But if I have 10 fields that each need a metadata
authority backing store, and there is no Library of Congress metadata
service provider for each, do you need to construct your own metadata
silos? Could you build a single external metadata service provider system
that could be integrated with DSpace and be mapped to 10 different fields?
Author (firstname, lastname, institution), Review (# of stars, title,
description), Link (link name, url), ScientificClassification (Kingdom,
Phylum, Class, Order, Suborder, Family, Genus, Species), ...
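
One way to picture a single provider serving many fields is a registry that
maps each "rich" type to its allowed sub-fields and rejects anything else. A
minimal sketch, assuming an in-memory registry; NestedTypeRegistry and its
methods are hypothetical names, not a DSpace or authority-framework API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry, not a DSpace API: one place that knows the
// allowed sub-fields for every "rich" metadata type.
class NestedTypeRegistry {
    private final Map<String, Set<String>> allowedKeys = new HashMap<>();

    void register(String type, String... keys) {
        allowedKeys.put(type, new LinkedHashSet<>(Arrays.asList(keys)));
    }

    // True when the type is known and every supplied key is allowed for it.
    boolean isValid(String type, Map<String, String> value) {
        Set<String> allowed = allowedKeys.get(type);
        return allowed != null && allowed.containsAll(value.keySet());
    }

    public static void main(String[] args) {
        NestedTypeRegistry registry = new NestedTypeRegistry();
        registry.register("Author", "firstname", "lastname", "institution");
        registry.register("Link", "name", "url");

        Map<String, String> author = new HashMap<>();
        author.put("firstname", "Peter");
        author.put("lastname", "Dietz");
        System.out.println(registry.isValid("Author", author));

        // An undeclared sub-field makes the value invalid for this type.
        author.put("eye-color", "Brown");
        System.out.println(registry.isValid("Author", author));
    }
}
```

Adding an eleventh rich field would then be one more register call rather
than a new provider, table, or class.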

For reference, I've stored some items with the value for author serialized
as JSON.
https://trydspace.longsight.com/handle/123456789/175
https://trydspace.longsight.com/rest/handle/123456789/175?expand=all
  "metadata": [
    {
      "key": "dc.contributor.author",
      "value": "{firstname:Mary Davis, lastname:MacNaughton, role:Editor}",
      "language": ""
    },
    {
      "key": "dc.contributor.author",
      "value": "{firstname:Michael, lastname:Duncan, role:Contributor}",
      "language": ""
    },

http://dspace-rest-client-play.herokuapp.com/item/202
http://dspace-rails.herokuapp.com/item/202
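
Note that the stored values shown above have unquoted keys and values, so a
strict JSON parser would reject them and a client has to split them by hand.
A minimal sketch of such a client-side parser, assuming no value ever
contains a comma, colon, or brace; LooseValueParser is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client-side helper for values shaped like
// {firstname:Michael, lastname:Duncan, role:Contributor}.
// Assumes no value contains a comma, colon, or brace.
class LooseValueParser {

    static Map<String, String> parse(String raw) {
        Map<String, String> result = new LinkedHashMap<>();
        String body = raw.trim();
        // Strip the surrounding braces if both are present.
        if (body.startsWith("{") && body.endsWith("}")) {
            body = body.substring(1, body.length() - 1);
        }
        for (String pair : body.split(",")) {
            int colon = pair.indexOf(':');
            if (colon < 0) {
                continue; // skip fragments that are not key:value
            }
            result.put(pair.substring(0, colon).trim(),
                       pair.substring(colon + 1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> author =
                parse("{firstname:Mary Davis, lastname:MacNaughton, role:Editor}");
        System.out.println(author.get("firstname") + " / " + author.get("role"));
    }
}
```

The stated assumption is exactly where this breaks down: a value such as
"Dietz, Peter" would be split in the wrong place, which is an argument for
storing properly quoted JSON or a real nested structure instead.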

Or is flat metadata really best? Do you really need DSpace to store
metadata about metadata (e.g. Author.eye-color), or is storing "Dietz,
Peter" sufficient? Or is flat storage just our current limitation?


________________
Peter Dietz
Longsight
www.longsight.com
pe...@longsight.com
p: 740-599-5005 x809

On Thu, Jul 30, 2015 at 9:06 AM, Mark H. Wood <mw...@iupui.edu> wrote:

> On Wed, Jul 29, 2015 at 04:06:19PM -0400, Peter Dietz wrote:
> > Has anyone stored nested / rich metadata in DSpace?
> >
> > An example I'm thinking of is for storing richer amounts of metadata for an
> > object. For example:
> >
> >    - Author
> >       - first-name: Peter
> >       - last-name: Dietz
> >       - name-as-it-appears: Peter Dietz
> >       - institution: Longsight
> >       - date-of-birth: ...
> >       - ...
> >    - Author
> >       - first-name: Sam
> >       - last-name: Ottenhoff
> >       - ...
> >
> > The Authority Control system of DSpace looks like it approaches this, but
> > the documentation isn't clear, and I'm not sure if it requires that your
> > data values reside in some Library of Congress registry.
>
> You can create other authority providers.  (The documentation is
> indeed sketchy.  The code is in
> dspace-api:org.dspace.content.authority.  Sadly there is no
> package-level documentation to help us understand how the package is
> organized.)
>
> > The hack-job I have in mind would be to serialize the information... to
> > json... and then store that into a metadata field.
> >
> > So.
> > schema.author.serialized = {first-name: "Peter", last-name: "Dietz",
> > "name-as-it-appears" : "Peter Dietz", "institution": "Longsight", ... }
> >
> > However, I'm tempted to think that DSpace should either have the ability to
> > plug into any registry (hopefully there are registries you can populate and
> > maintain with your own local data), or to extend DSpace's metadata data
> > model to support nested/rich data.
>
> DSpace already has infrastructure sufficient to represent the above.
> We just don't define:
>
>       somenamespace.person.givenname
>       somenamespace.person.surname
>       somenamespace.person.preferred
>       somenamespace.person.affiliation
>       somenamespace.person.dob
>
> That part is easy to fix.  The hard part is that DSpace treats author
> names as immediate strings rather than identifiers for related
> "person" objects.  Fixing that will take a bit of work.  It ties in
> with existing and ongoing work to integrate ORCID, too.
>
> --
> Mark H. Wood
> Lead Technology Analyst
>
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu
>
>
> ------------------------------------------------------------------------------
>
> _______________________________________________
> DSpace-tech mailing list
> DSpace-tech@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
> List Etiquette:
> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>