Hi !

On Wed, Jun 29, 2011 at 3:44 PM, Piotr Praczyk <[email protected]> wrote:

>  In this case of no controversies, I am going to implement this schema.
>
Just to encourage your idea (with no controversies!), I tried to create a
valid METS file by merging your data into an official METS sample; please
take a look at the attachment below.


> With the plugin infrastructure loading different BibDoc classes depending
> on a type (discussed in a separate thread), we can even allow different
> meta-data formats inside METS.
>
I agree with the result you are aiming for.
I would only add this: in the future, for proper compliance with METS, the
TYPE attribute values (and many other details) could be drawn from
controlled vocabularies, either by choosing an existing METS profile or by
proposing a "CERN METS profile" to the Library of Congress (why not :-).
Indeed, the "BibObject" type does not exist at the moment, but I think it
can be treated as an internal convention in our use of METS, taking
advantage of its flexibility ("*DIV.TYPE attribute values include: chapter,
article, page, track, segment, section etc. METS places no constraints on
the possible TYPE values. Suggestions for controlled vocabularies for TYPE
may be found on the METS website*").


>  ------------------------------
> *From:* Piotr Praczyk
> *Sent:* 24 June 2011 19:51
> *To:* project-cdsware-developers (CDS Invenio developers); Salvatore Mele;
> Francisco Javier Nogueras Iso
> *Subject:* partial METS support for storing figures
>

> Implementation of this will make Invenio input METS-like, but it will not
> allow input from other repositories.
>
In my opinion there is nothing strange about supporting METS only for export
(and not for import), at least to begin with.


> It is very very unlikely that data coming from other repositories will be
> formatted according to this schema so maybe it is better not to make it
> METS-like
> so that nobody gets confused into thinking that general METS is supported?
>
> Do I understand correctly that every METS document is supposed to describe
> exactly one object ?  This would make the example break conventions of METS.
>
I think that one METS object is a "descriptive unit" within which you can
refer to many documents (and many descriptive levels, and many files).
The official Library of Congress example we shared (
http://lcweb2.loc.gov/diglib/ihas/loc.natlib.gottlieb.09601/contactsheet.html)
also contains two document versions, described both in the file and
structural sections ("fileSec" and "structMap") and in the descriptive
metadata (MARC/MODS).


>
> Here is how I imagine mapping the objects described earlier:
>
> - The structMap being a container for objects (every top-level division
> would be interpreted as one of the entities described in one of the previous
> e-mails)
>
>     - Uploading/modifying Bibobjects
>         A division having a type "BibObject" would represent a single
> BibObject.
>
>         The ID field can be used for encoding versions, permanent ids and
> temporary ids (depending on prefixes)
>
Your idea (for constructing IDs) closely resembles the convention adopted
for the ID attribute in the METS profile of the Fedora project (
http://fedora-commons.org/download/2.1.1/userdocs/digitalobjects/rulesForMETS.html):
*The recommended convention is ID="DSn.v" where n is the number of the
datastream and v is the version number (for example ID=DS1.0 or ID=DS1.1)*
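
To make the Fedora convention concrete, here is a minimal Python sketch
(Python only because Invenio is written in it) that parses and increments a
"DSn.v" identifier. The names parse_ds_id and next_version_id are made up
for illustration and belong to neither Fedora nor Invenio. One point to keep
in mind for our own prefixes: the METS schema declares ID attributes as
xsd:ID, and an xsd:ID value cannot contain a colon, so a separator such as
"." is safer than ":".

```python
import re

# Fedora-style datastream ID: "DS" + datastream number + "." + version number.
_DS_ID = re.compile(r"^DS(\d+)\.(\d+)$")


def parse_ds_id(value):
    """Return (datastream, version) for an ID such as 'DS1.0'."""
    match = _DS_ID.match(value)
    if match is None:
        raise ValueError("not a DSn.v identifier: %r" % value)
    return int(match.group(1)), int(match.group(2))


def next_version_id(value):
    """Return the ID of the following version of the same datastream."""
    datastream, version = parse_ds_id(value)
    return "DS%d.%d" % (datastream, version + 1)
```

With these helpers, next_version_id("DS1.0") gives "DS1.1", and an ID that
does not follow the convention is rejected with a ValueError.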

>         BibObject division should not contain subdivisions, only <fptr>
> links to files appearing in BibObject.
>
>     example:
>
>    <div id="tmp:NewObject1" type="BibObject">
>      <div TYPE="version" ID="tmp:NewObject1:newVersion">
>         <fptr ... />
>         <fptr ... />
>      </div>
>    </div>
>
>
>     or new version of an existing record:
>
>   <div id="objectId:123" type="BibObject">
>      <div TYPE="version" ID="objectId:123:newVersion">
>         <fptr ... />
>      </div>
>    </div>
>
>
>     - relations between versions of BibDocs implemented using structLink
> element
>
>     for example:
>
>     Linking using temporary identifiers:
>    <structLink xlink:from="tmp:NewObject1:newVersion"
> xlink:to="objId:232314:3" xlink:arcrole="is_extracted_from" DMID="link to
> metadata in serialised moreinfo " />
>
>
>     Linking already existing entities:
>    <structLink xlink:from="objId:2334234:5" xlink:to="objId:232314:6"
> xlink:arcrole="is_extracted_from" DMID="link to metadata in serialised
> moreinfo " />
>
>
>     Similarly, we could provide relations between particular bibdocfiles.
>
Please take a look at the attached METS file. It is ABSOLUTELY NOT a good
or definitive example for all the METS support needs in Invenio (it does not
refer to a suitable METS profile), but it does show the connection between
the "objectId123" and "objectId123newVersion" files, with their respective
descriptions in the MARC/MODS metadata and with different file formats
(jpg/tif) for each object.
You can find the original sources of the file in its comments. I validated
it with http://www.validome.org/xml/ (some stricter parsers report more
problems, but the idea was what mattered ;-)
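
As a rough illustration of how such a file could be read back, the Python
sketch below walks the structMap and resolves the <fptr> elements of each
version to file locations. It assumes the TYPE="BibObject:version"
convention of the attached file; the embedded sample is a trimmed copy with
shortened placeholder paths, and files_by_version is a hypothetical helper,
not an Invenio function.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Trimmed copy of the attachment; the href values are placeholders.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                      xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="MASTER">
      <mets:file ID="m1"><mets:FLocat LOCTYPE="URL" xlink:href="ver01/0001.tif"/></mets:file>
      <mets:file ID="m2"><mets:FLocat LOCTYPE="URL" xlink:href="ver02/0001.tif"/></mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div TYPE="BibObject" DMDID="objectId123">
      <mets:div TYPE="BibObject:version" DMDID="objectId123">
        <mets:div TYPE="BibObject:image"><mets:fptr FILEID="m1"/></mets:div>
      </mets:div>
      <mets:div TYPE="BibObject:version" DMDID="objectId123newVersion">
        <mets:div TYPE="BibObject:image"><mets:fptr FILEID="m2"/></mets:div>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def files_by_version(mets_xml):
    """Map the DMDID of each version division to its file locations."""
    root = ET.fromstring(mets_xml)
    # Index every file in fileSec by its ID attribute.
    locations = {}
    for file_el in root.iter(METS + "file"):
        flocat = file_el.find(METS + "FLocat")
        locations[file_el.get("ID")] = flocat.get(XLINK_HREF)
    # Collect the files referenced below each version division.
    result = {}
    for div in root.iter(METS + "div"):
        if div.get("TYPE") != "BibObject:version":
            continue
        result[div.get("DMDID")] = [
            locations[fptr.get("FILEID")] for fptr in div.iter(METS + "fptr")
        ]
    return result
```

Calling files_by_version(SAMPLE) returns one entry per version, keyed by
"objectId123" and "objectId123newVersion".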

>     - storing more-info in Descriptive metadata objects in "binary" format
> (serialised JSON) and assigning using DMDID to entities.
>   This would be simple to implement but clearly would make this part of
> METS files non-readable
>
In any case, we know that METS can wrap an entire binary record, instructions
or code (or even other XML, base64-encoded) in the "binData" element:
http://www.loc.gov/standards/mets/docs/mets.v1-9.html#binData
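
For instance, the serialised more-info could travel in a dmdSec built like
this. It is only a sketch under assumptions: the OTHERMDTYPE value
"INVENIO-MOREINFO" and the helper names are invented here, and the payload
is base64-encoded because the METS schema types binData as base64Binary.

```python
import base64
import json
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def wrap_moreinfo(dmd_id, moreinfo):
    """Build a dmdSec whose binData holds base64-encoded JSON."""
    dmd_sec = ET.Element("{%s}dmdSec" % METS_NS, {"ID": dmd_id})
    md_wrap = ET.SubElement(
        dmd_sec,
        "{%s}mdWrap" % METS_NS,
        # "INVENIO-MOREINFO" is a made-up label, not a registered type.
        {"MDTYPE": "OTHER", "OTHERMDTYPE": "INVENIO-MOREINFO"},
    )
    bin_data = ET.SubElement(md_wrap, "{%s}binData" % METS_NS)
    payload = json.dumps(moreinfo, sort_keys=True).encode("utf-8")
    bin_data.text = base64.b64encode(payload).decode("ascii")
    return dmd_sec


def unwrap_moreinfo(dmd_sec):
    """Decode the JSON stored by wrap_moreinfo."""
    bin_data = dmd_sec.find("{%s}mdWrap/{%s}binData" % (METS_NS, METS_NS))
    return json.loads(base64.b64decode(bin_data.text).decode("utf-8"))
```

A division would then point at this section through its DMDID attribute, and
the reader decodes it with unwrap_moreinfo.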

I hope that my mail confirms your proposed solution!

Cheers,
Cristian
<?xml version="1.0" encoding="UTF-8"?>
<!-- METS original record from the Library of Congress record http://lcweb2.loc.gov/diglib/ihas/loc.natlib.gottlieb.09601/contactsheet.html -->
<mets:mets 
xmlns:lc="http://www.loc.gov/mets/profiles/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:mods="http://www.loc.gov/mods/v3"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:mets="http://www.loc.gov/METS/"
xmlns:idx="info:lc/xq-modules/lcindex"
xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd"
xmlns:mxe="http://www.loc.gov/mxe">

<!-- MARC21 record from CERN Document Server http://cdsweb.cern.ch/record/1362001/export/xm -->
  <mets:dmdSec ID="dmd1">
    <mets:mdWrap MDTYPE="MODS">
      <mets:xmlData>
   <mods:mods xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://www.loc.gov/mods/v3"
                xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-4.xsd" ID="objectId123">
      <mods:titleInfo>
         <mods:title>Leçons sur la théorie mathématique de l'élasticité des corps solides</mods:title>
      </mods:titleInfo>
      <mods:name type="personal" usage="primary">
         <mods:namePart>Lame, Gabriel</mods:namePart>
      </mods:name>

      <mods:typeOfResource/>
      <mods:originInfo>
         <mods:place>
            <mods:placeTerm type="text">Charleston, SC</mods:placeTerm>
         </mods:place>
         <mods:publisher>Bibliolife</mods:publisher>
         <mods:dateIssued>2009</mods:dateIssued>

      </mods:originInfo>
      <mods:language>
         <mods:languageTerm authority="iso639-2b" type="code">fre</mods:languageTerm>
      </mods:language>
      <mods:physicalDescription>
         <mods:extent>335 p</mods:extent>
      </mods:physicalDescription>
      <mods:subject authority="SzGeCERN">

         <mods:topic>Mathematical Physics and Mathematics</mods:topic>
      </mods:subject>
      <mods:classification authority="udc">539.52</mods:classification>
      <mods:identifier type="isbn">9781113042682</mods:identifier>
      <mods:recordInfo>
         <mods:recordIdentifier source="SzGeCERN">1362001</mods:recordIdentifier>

<!-- 	The original MARC21 CERN record was converted to MODS using the Library of Congress XSL http://www.loc.gov/standards/mods/v3/MARC21slim2MODS3-4.xsl
	described in http://www.loc.gov/standards/mods/mods-conversions.html
	using the saxonb-xslt transformer -->
         <mods:recordOrigin>Converted from MARCXML to MODS version 3.4 using MARC21slim2MODS3-4.xsl
				(Revision 1.70)</mods:recordOrigin>

      </mods:recordInfo>
	<!-- Addition to the original CERN record, (taken from the original LoC-METS) to manage the file versions - START -->
	<mods:relatedItem type="otherVersion" ID="objectId123newVersion">
		<mods:note type="version">contact print with annotations</mods:note>
	</mods:relatedItem>
	<!-- END   -->
   </mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>

  <mets:fileSec>
    <mets:fileGrp USE="MASTER">
      <mets:file MIMETYPE="image/tiff" GROUPID="G1" ID="m1">
	<mets:FLocat LOCTYPE="URL" xlink:href="http://lcweb2.loc.gov/natlib/ihas/warehouse/gottlieb/09601/ver01/0001.tif"/>
      </mets:file>
      <mets:file MIMETYPE="image/tiff" GROUPID="G1" ID="m2">
	<mets:FLocat LOCTYPE="URL" xlink:href="http://lcweb2.loc.gov/natlib/ihas/warehouse/gottlieb/09601/ver02/0001.tif"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="SERVICE">
      <mets:file MIMETYPE="image/jpeg" GROUPID="G1" ID="s1">
	<mets:FLocat LOCTYPE="URL" xlink:href="http://lcweb2.loc.gov/natlib/ihas/service/gottlieb/09601/ver01/0001v.jpg"/>
      </mets:file>
      <mets:file MIMETYPE="image/jpeg" GROUPID="G1" ID="s2">
	<mets:FLocat LOCTYPE="URL" xlink:href="http://lcweb2.loc.gov/natlib/ihas/service/gottlieb/09601/ver02/0001v.jpg"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>

  <mets:structMap>
    <mets:div DMDID="objectId123" TYPE="BibObject">
      <mets:div TYPE="BibObject:version" DMDID="objectId123">
	<mets:div TYPE="BibObject:image">
	  <mets:fptr FILEID="m1"/>
	  <mets:fptr FILEID="s1"/>
	</mets:div>
      </mets:div>
      <mets:div TYPE="BibObject:version" DMDID="objectId123newVersion">
	<mets:div TYPE="BibObject:image">
	  <mets:fptr FILEID="m2"/>
	  <mets:fptr FILEID="s2"/>
	</mets:div>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>
