Hi all,

I'm writing to gather some input on a request we have at UAB that could also be useful to other users: implementing METS and PREMIS in Invenio.
I'm including a crash course on METS, MODS, PREMIS and MIX at the end of this message for the benefit of those who haven't had a chance to look at them.

My question is whether PREMIS and METS are in the Invenio pipeline (I haven't seen them at https://savannah.cern.ch/task/?group=cdsware). If they are not, I'd like to collect some preliminary ideas about whether they could be implemented, and how.

From what I have read, PREMIS should not be mixed with the descriptive metadata (MARCXML in Invenio's case). My first, preliminary conclusion is that it would be better kept in separate tables, with the data pulled out only when needed, whether for basic Web browsing or via the OAI server. Technical details of the digital objects would best be extracted automatically by software (e.g., ImageMagick or JHOVE). Permissions and copyright issues are also dealt with separately.

In our case, the Spanish Ministry of Culture is offering grants for the digitisation of old journals to improve access to the historical press (http://prensahistorica.mcu.es/), and METS and PREMIS compliance earns 'extra points', so to speak.

I know it can be hard to say anything with this little information, but I'd like to hear CERN's ideas on this issue (and sooner rather than later, given our timetable ;-)

Thanks a lot,
Ferran

---
Crash course on METS, MODS, MIX and PREMIS

First of all, all these standards are endorsed by the Library of Congress. Their standards page (http://www.loc.gov/standards/) has a one-sentence description of each of them, plus all the details on their respective pages. However, it took me a while until I 'got' them and put them all into perspective, and that is the humble purpose of these paragraphs. Please take them very cautiously; I've only just learned them and I'm no expert. That said, here we go:

In the world of digital preservation, there is agreement that it is necessary to keep metadata of several kinds for each digital object, so that preservation policies can be applied, now or in the future.
This metadata can (or must) be of several kinds:

- Descriptive: examples are the well-known MARC or MARCXML, Dublin Core or MODS. MODS (Metadata Object Description Schema, http://www.loc.gov/standards/mods/) is, roughly speaking, a subset of MARC21, but richer than Dublin Core. Invenio already provides two of them, so no problem here, and an optional MODS output (http://www.loc.gov/standards/mods/mods-mapping.html) can be worked out when the XML bibformats stabilise.
- Administrative: including rights and permissions, provenance (origin) and structural metadata. The preservation ones are expressed in PREMIS (http://www.loc.gov/standards/premis/).
- Technical: such as image details (MIX, http://www.loc.gov/standards/mix/) or text details (textMD).

METS (http://www.loc.gov/standards/mets/) basically wraps all of them together.
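
To make the "wrapping" idea a bit more concrete, here is a minimal sketch in Python of how a METS document could hold the three kinds of metadata for a single scanned page. It is only my reading of the METS layout (dmdSec for descriptive metadata, amdSec with techMD and digiprovMD for technical and preservation metadata, plus fileSec and structMap), and it has not been validated against the METS schema. The function names (build_mets, wrap), the IDs (DMD1, TECH1, ...) and the payload elements are hypothetical placeholders, not anything that exists in Invenio.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def wrap(parent, tag, mdtype, payload, **attrs):
    """Embed an XML payload as parent/tag/mdWrap/xmlData (hypothetical helper)."""
    section = ET.SubElement(parent, m(tag), attrs)
    md_wrap = ET.SubElement(section, m("mdWrap"), {"MDTYPE": mdtype})
    ET.SubElement(md_wrap, m("xmlData")).append(payload)
    return section


def build_mets(descriptive, preservation, technical, file_url):
    """Wrap already-built metadata elements for one file into a METS tree."""
    mets = ET.Element(m("mets"))

    # Descriptive metadata (e.g. the MARCXML record).
    wrap(mets, "dmdSec", "MARC", descriptive, ID="DMD1")

    # Administrative metadata: technical (MIX) first, then preservation (PREMIS).
    amd = ET.SubElement(mets, m("amdSec"), {"ID": "AMD1"})
    wrap(amd, "techMD", "NISOIMG", technical, ID="TECH1")
    wrap(amd, "digiprovMD", "PREMIS", preservation, ID="PREMIS1")

    # The file itself, pointing back at its administrative metadata.
    file_sec = ET.SubElement(mets, m("fileSec"))
    file_grp = ET.SubElement(file_sec, m("fileGrp"))
    file_el = ET.SubElement(
        file_grp, m("file"), {"ID": "FILE1", "ADMID": "TECH1 PREMIS1"}
    )
    ET.SubElement(
        file_el, m("FLocat"),
        {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": file_url},
    )

    # Structural map tying the description to the file.
    struct_map = ET.SubElement(mets, m("structMap"))
    div = ET.SubElement(struct_map, m("div"), {"DMDID": "DMD1"})
    ET.SubElement(div, m("fptr"), {"FILEID": "FILE1"})
    return mets


if __name__ == "__main__":
    # Stand-in payloads; real ones would be full MARCXML, PREMIS and MIX trees.
    doc = build_mets(
        ET.Element("record"), ET.Element("premis"), ET.Element("mix"),
        "http://example.org/page1.tif",
    )
    print(ET.tostring(doc, encoding="unicode"))
```

The point of the sketch is that the descriptive record stays untouched: PREMIS and MIX live in their own sections and are only linked to the file through IDs, which fits the idea of keeping them in separate tables and assembling the METS document on demand.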