Hi!
On Wed, Jun 29, 2011 at 3:44 PM, Piotr Praczyk <[email protected]> wrote:
> In this case of no controversies, I am going to implement this schema.
>
Just to encourage your idea (with no controversies!), I tried to create a
valid METS file by merging your data into an official METS sample: please
take a look below.

> With the plugin infrastructure loading different BibDoc classes depending
> on a type (discussed in different discussion), we can even allow different
> meta-data formats inside METS.
>
I agree with the result you are aiming for.
I would only add this: in the future, for proper METS compliance, the TYPE
attribute values (and many other details) could refer to controlled
vocabularies, either chosen from an existing METS profile or defined in a
"CERN.METS.profile" proposed to the Library of Congress (why not :-).
Indeed the "BibObject" type does not exist at the moment, but I think it
can be treated purely as an internal convention in our use of METS, taking
advantage of its flexibility (*"DIV.TYPE attribute values include: chapter,
article, page, track, segment, section etc. METS places no constraints on
the possible TYPE values. Suggestions for controlled vocabularies for TYPE
may be found on the METS website"*).
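To make the "internal convention" idea concrete, here is a minimal sketch of how TYPE values could be checked against a local vocabulary. The vocabulary below and the function name are my own assumptions for illustration; they are not part of any registered METS profile or of existing Invenio code.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical internal vocabulary: a local convention only,
# not taken from any registered METS profile.
ALLOWED_DIV_TYPES = {"BibObject", "BibObject:version", "BibObject:image"}


def unknown_div_types(mets_xml, allowed=ALLOWED_DIV_TYPES):
    """Return the sorted TYPE values of mets:div elements not in `allowed`."""
    root = ET.fromstring(mets_xml)
    found = {div.get("TYPE") for div in root.iter("{%s}div" % METS_NS)}
    found.discard(None)  # divs without a TYPE attribute are ignored
    return sorted(found - set(allowed))


SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap>
    <mets:div TYPE="BibObject">
      <mets:div TYPE="BibObject:version"/>
      <mets:div TYPE="chapter"/>
    </mets:div>
  </mets:structMap>
</mets:mets>"""

print(unknown_div_types(SAMPLE))
```

Since METS itself places no constraints on TYPE, a check like this would live on the Invenio side rather than in the schema.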
> ------------------------------
> *From:* Piotr Praczyk
> *Sent:* 24 June 2011 19:51
> *To:* project-cdsware-developers (CDS Invenio developers); Salvatore Mele;
> Francisco Javier Nogueras Iso
> *Subject:* partial METS support for storing figures
>
> Implementation of this will make Invenio input METS-like, but it will not
> allow input from other repositories.
>
In my opinion there is nothing strange in supporting METS only for export
(and not for import), at least to begin with.

> It is very very unlikely that data coming from other repositories will be
> formatted according to this schema so maybe it is better not to make it
> METS-like
> so that nobody gets confused that general METS is supported?
>
> Do I understand correctly that every METS document is supposed to describe
> exactly one object ? This would make the example break conventions of METS.
>
I think that one METS object is a "descriptive unit" within which you can
refer to many documents (and many descriptive levels, and many, many files).
In the official Library of Congress example we shared (
http://lcweb2.loc.gov/diglib/ihas/loc.natlib.gottlieb.09601/contactsheet.html)
we also find two document versions, described both in the file and
structural sections ("fileSec" and "structMap") and in the descriptive
metadata (MARC/MODS).
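As a rough illustration of one METS document describing several versions, the sketch below walks a structMap and collects the file locations referenced by each version division. It assumes the "BibObject:version" TYPE convention used in the attached file; the sample is a reduced stand-in with shortened paths, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
HREF = "{http://www.w3.org/1999/xlink}href"

# Reduced stand-in for the attached file (paths shortened for readability).
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                      xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="MASTER">
      <mets:file ID="m1"><mets:FLocat LOCTYPE="URL" xlink:href="ver01/0001.tif"/></mets:file>
      <mets:file ID="m2"><mets:FLocat LOCTYPE="URL" xlink:href="ver02/0001.tif"/></mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div TYPE="BibObject">
      <mets:div TYPE="BibObject:version" DMDID="objectId123">
        <mets:div TYPE="BibObject:image"><mets:fptr FILEID="m1"/></mets:div>
      </mets:div>
      <mets:div TYPE="BibObject:version" DMDID="objectId123newVersion">
        <mets:div TYPE="BibObject:image"><mets:fptr FILEID="m2"/></mets:div>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def files_by_version(mets_xml):
    """Map each version division (keyed by its DMDID) to its file locations."""
    root = ET.fromstring(mets_xml)
    # Index every mets:file by ID so that fptr references can be resolved.
    locations = {}
    for f in root.iter(METS + "file"):
        flocat = f.find(METS + "FLocat")
        if flocat is not None:
            locations[f.get("ID")] = flocat.get(HREF)
    versions = {}
    for div in root.iter(METS + "div"):
        if div.get("TYPE") == "BibObject:version":
            ids = [p.get("FILEID") for p in div.iter(METS + "fptr")]
            versions[div.get("DMDID")] = [locations.get(i) for i in ids]
    return versions


print(files_by_version(SAMPLE))
```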
>
> Here is how I imagine the mapping of the objects described earlier:
>
> - The structMap being a container for objects (every top-level division
> would be interpreted as one of entities described in one of previous
> e-mails)
>
> - Uploading/modifying Bibobjects
> A division having a type "BibObject" would represent a single
> BibObject.
>
> The ID field can be used for encoding versions, permanent ids and
> temporary ids (depending on prefixes)
>
Your idea for constructing IDs looks very much like the convention adopted
for the ID attribute in the METS profile of the Fedora project (
http://fedora-commons.org/download/2.1.1/userdocs/digitalobjects/rulesForMETS.html):
*The recommended convention is ID="DSn.v" where n is the number of the
datastream and v is the version number (for example ID=DS1.0 or ID=DS1.1)*
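A small sketch of how both ID conventions could be parsed follows. The "prefix:identifier[:version]" layout is only my reading of your examples (tmp:NewObject1:newVersion, objectId:123:newVersion), and the function and class names are hypothetical. One caveat, as far as I know: METS ID attributes are of type xsd:ID, which is an NCName and cannot contain colons, so in the attached file I used colon-free values such as objectId123.

```python
import re
from typing import NamedTuple, Optional


class ObjectRef(NamedTuple):
    kind: str               # "tmp" for temporary ids, otherwise the id prefix
    ident: str              # object identifier
    version: Optional[str]  # version label, or None when absent


def parse_object_id(value):
    """Parse 'prefix:identifier[:version]' (assumed layout, see lead-in)."""
    parts = value.split(":")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError("unexpected identifier: %r" % value)
    version = parts[2] if len(parts) == 3 else None
    return ObjectRef(parts[0], parts[1], version)


# Fedora convention quoted above: DSn.v (datastream number, version number).
FEDORA_DS = re.compile(r"^DS(\d+)\.(\d+)$")


def parse_fedora_id(value):
    """Return (datastream number, version number) for an ID like 'DS1.0'."""
    match = FEDORA_DS.match(value)
    if match is None:
        raise ValueError("not a DSn.v identifier: %r" % value)
    return int(match.group(1)), int(match.group(2))


print(parse_object_id("tmp:NewObject1:newVersion"))
print(parse_fedora_id("DS1.1"))
```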

> BibObject division should not contain subdivisions, only <fptr>
> links to files appearing in BibObject.
>
> example:
>
> <div ID="tmp:NewObject1" TYPE="BibObject">
> <div TYPE="version" ID="tmp:NewObject1:newVersion">
> <fptr ... />
> <fptr ... />
> </div>
> </div>
>
>
> or new version of an existing record:
>
> <div ID="objectId:123" TYPE="BibObject">
> <div TYPE="version" ID="objectId:123:newVersion">
> <fptr ... />
> </div>
> </div>
>
>
> - relations between versions of BibDocs implemented using structLink
> element
>
> for example:
>
> Linking using temporary identifiers:
> <structLink xlink:from="tmp:NewObject1:newVersion"
> xlink:to="objId:232314:3" xlink:arcrole="is_extracted_from" DMID="link to
> metadata in serialised moreinfo " />
>
>
> Linking already existing entities:
> <structLink xlink:from="objId:2334234:5" xlink:to="objId:232314:6"
> xlink:arcrole="is_extracted_from" DMID="link to metadata in serialised
> moreinfo " />
>
>
> Similarly, we could provide relations between particular bibdocfiles.
>
Please take a look at the attached METS file: it is ABSOLUTELY NOT a good
or definitive example for all the needs of METS support in Invenio (it does
not refer to a suitable METS profile), but it supports the connection
between the "objectId123" and "objectId123newVersion" files, with their
respective descriptions in the MARC/MODS metadata and different file
formats (jpg/tif) for each object.
You can find the original sources of the file in the comments. I validated
it with http://www.validome.org/xml/ (some stricter parsers report more
problems, but the idea was the important thing ;-)
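Beyond schema validation, a quick consistency check of the internal references can be sketched as below. This is only an illustration under my own assumptions: the function name is hypothetical, and it accepts any ID attribute in the document as a DMDID target (as the attached file does), whereas a stricter check would accept only dmdSec IDs.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"

SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:dmdSec ID="dmd1"/>
  <mets:fileSec>
    <mets:fileGrp><mets:file ID="m1"/></mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div DMDID="dmd1"><mets:fptr FILEID="m1"/></mets:div>
    <mets:div DMDID="missing"><mets:fptr FILEID="m9"/></mets:div>
  </mets:structMap>
</mets:mets>"""


def dangling_references(mets_xml):
    """List (attribute, value) pairs that point at no element in the file."""
    root = ET.fromstring(mets_xml)
    file_ids = {f.get("ID") for f in root.iter(METS + "file")}
    # Lenient assumption: any ID in the document may be a DMDID target.
    all_ids = {el.get("ID") for el in root.iter() if el.get("ID")}
    problems = []
    for fptr in root.iter(METS + "fptr"):
        if fptr.get("FILEID") not in file_ids:
            problems.append(("FILEID", fptr.get("FILEID")))
    for div in root.iter(METS + "div"):
        # DMDID may hold several space-separated references.
        for ref in (div.get("DMDID") or "").split():
            if ref not in all_ids:
                problems.append(("DMDID", ref))
    return problems


print(dangling_references(SAMPLE))
```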

> - storing more-info in Descriptive metadata objects in "binary" format
> (serialised JSON) and assigning using DMDID to entities.
> This would be simple to implement but clearly would make this part of
> METS files non-readable
>
In any case, METS can wrap an entire binary record, instruction or code (or
even other XML), Base64-encoded, in the "binData" element:
http://www.loc.gov/standards/mets/docs/mets.v1-9.html#binData
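For instance, the serialised JSON "moreinfo" could be carried roughly as in the sketch below. This is a minimal illustration, not existing Invenio code: the function names and the OTHERMDTYPE label "MOREINFO-JSON" are my own assumptions, while MDTYPE="OTHER" with OTHERMDTYPE is the standard METS way to declare a metadata type that is not in the predefined list.

```python
import base64
import json
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def moreinfo_dmdsec(dmd_id, moreinfo):
    """Build a mets:dmdSec carrying `moreinfo` as Base64-encoded JSON."""
    payload = json.dumps(moreinfo, sort_keys=True).encode("utf-8")
    dmd = ET.Element("{%s}dmdSec" % METS_NS, {"ID": dmd_id})
    wrap = ET.SubElement(dmd, "{%s}mdWrap" % METS_NS, {
        "MDTYPE": "OTHER",
        "OTHERMDTYPE": "MOREINFO-JSON",  # hypothetical label
        "MIMETYPE": "application/json",
    })
    bin_data = ET.SubElement(wrap, "{%s}binData" % METS_NS)
    # binData content must be Base64 (xsd:base64Binary).
    bin_data.text = base64.b64encode(payload).decode("ascii")
    return dmd


def read_moreinfo(dmd):
    """Decode the JSON stored by moreinfo_dmdsec()."""
    node = dmd.find("{%s}mdWrap/{%s}binData" % (METS_NS, METS_NS))
    return json.loads(base64.b64decode(node.text).decode("utf-8"))


section = moreinfo_dmdsec("dmd2", {"caption": "Figure 1", "page": 3})
print(ET.tostring(section, encoding="unicode"))
print(read_moreinfo(section))
```

The payload stays opaque to generic METS tools, which matches your remark that this part of the file would not be human-readable.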
I hope that my mail confirms your proposed solution!
Cheers,
Cristian
<?xml version="1.0" encoding="UTF-8"?>
<!-- METS original record from the Library of Congress record http://lcweb2.loc.gov/diglib/ihas/loc.natlib.gottlieb.09601/contactsheet.html -->
<mets:mets
xmlns:lc="http://www.loc.gov/mets/profiles/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:mods="http://www.loc.gov/mods/v3"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:mets="http://www.loc.gov/METS/"
xmlns:idx="info:lc/xq-modules/lcindex"
xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd"
xmlns:mxe="http://www.loc.gov/mxe">
<!-- MARC21 record from CERN Document Server http://cdsweb.cern.ch/record/1362001/export/xm -->
<mets:dmdSec ID="dmd1">
<mets:mdWrap MDTYPE="MODS">
<mets:xmlData>
<mods:mods xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.loc.gov/mods/v3"
xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-4.xsd" ID="objectId123">
<mods:titleInfo>
<mods:title>Leçons sur la théorie mathématique de l'élasticité des corps solides</mods:title>
</mods:titleInfo>
<mods:name type="personal" usage="primary">
<mods:namePart>Lame, Gabriel</mods:namePart>
</mods:name>
<mods:typeOfResource/>
<mods:originInfo>
<mods:place>
<mods:placeTerm type="text">Charleston, SC</mods:placeTerm>
</mods:place>
<mods:publisher>Bibliolife</mods:publisher>
<mods:dateIssued>2009</mods:dateIssued>
</mods:originInfo>
<mods:language>
<mods:languageTerm authority="iso639-2b" type="code">fre</mods:languageTerm>
</mods:language>
<mods:physicalDescription>
<mods:extent>335 p</mods:extent>
</mods:physicalDescription>
<mods:subject authority="SzGeCERN">
<mods:topic>Mathematical Physics and Mathematics</mods:topic>
</mods:subject>
<mods:classification authority="udc">539.52</mods:classification>
<mods:identifier type="isbn">9781113042682</mods:identifier>
<mods:recordInfo>
<mods:recordIdentifier source="SzGeCERN">1362001</mods:recordIdentifier>
                <!-- The original MARC21 CERN record was converted to MODS with the Library of Congress XSL http://www.loc.gov/standards/mods/v3/MARC21slim2MODS3-4.xsl
                (described in http://www.loc.gov/standards/mods/mods-conversions.html)
                using the saxonb-xslt transformer -->
<mods:recordOrigin>Converted from MARCXML to MODS version 3.4 using MARC21slim2MODS3-4.xsl
(Revision 1.70)</mods:recordOrigin>
</mods:recordInfo>
<!-- Addition to the original CERN record, (taken from the original LoC-METS) to manage the file versions - START -->
<mods:relatedItem type="otherVersion" ID="objectId123newVersion">
<mods:note type="version">contact print with annotations</mods:note>
</mods:relatedItem>
<!-- END -->
</mods:mods>
</mets:xmlData>
</mets:mdWrap>
</mets:dmdSec>
<mets:fileSec>
<mets:fileGrp USE="MASTER">
<mets:file MIMETYPE="image/tiff" GROUPID="G1" ID="m1">
<mets:FLocat LOCTYPE="URL" xlink:href="http://lcweb2.loc.gov/natlib/ihas/warehouse/gottlieb/09601/ver01/0001.tif"/>
</mets:file>
<mets:file MIMETYPE="image/tiff" GROUPID="G1" ID="m2">
<mets:FLocat LOCTYPE="URL" xlink:href="http://lcweb2.loc.gov/natlib/ihas/warehouse/gottlieb/09601/ver02/0001.tif"/>
</mets:file>
</mets:fileGrp>
<mets:fileGrp USE="SERVICE">
<mets:file MIMETYPE="image/jpeg" GROUPID="G1" ID="s1">
<mets:FLocat LOCTYPE="URL" xlink:href="http://lcweb2.loc.gov/natlib/ihas/service/gottlieb/09601/ver01/0001v.jpg"/>
</mets:file>
<mets:file MIMETYPE="image/jpeg" GROUPID="G1" ID="s2">
<mets:FLocat LOCTYPE="URL" xlink:href="http://lcweb2.loc.gov/natlib/ihas/service/gottlieb/09601/ver02/0001v.jpg"/>
</mets:file>
</mets:fileGrp>
</mets:fileSec>
<mets:structMap>
<mets:div DMDID="objectId123" TYPE="BibObject">
<mets:div TYPE="BibObject:version" DMDID="objectId123">
<mets:div TYPE="BibObject:image">
<mets:fptr FILEID="m1"/>
<mets:fptr FILEID="s1"/>
</mets:div>
</mets:div>
<mets:div TYPE="BibObject:version" DMDID="objectId123newVersion">
<mets:div TYPE="BibObject:image">
<mets:fptr FILEID="m2"/>
<mets:fptr FILEID="s2"/>
</mets:div>
</mets:div>
</mets:div>
</mets:structMap>
</mets:mets>