Alistair Miles
Wed, 14 Jan 2009 00:44:15 -0800
Hi Ed,

Thanks a lot, yes, a hand would be very much appreciated :) I'm just getting my bearings with MARC and MODS; I'll come back with some suggestions re where to go next asap.
In the meantime, a couple of quick questions...

When I run yaz-marcdump over one of the loc data files (e.g. part29.dat), it spits out lots of "record" elements but no root element, i.e. the output is not well-formed XML. Is this a side-effect of the data being broken into parts? Would it be correct to nest the "record" elements inside a "collection" root element?

I was looking at the sample MODS transformation of a MARC record found at

http://www.loc.gov/standards/marcxml/Sandburg/sandburgmods.xml

linked from

http://www.loc.gov/standards/marcxml///

... when I open this in oXygen I get

[SystemID:http://www.loc.gov/standards/mods/mods.xsd Severity: error Line:3 Column:247 EndLine:-1 EndColumn:-1 Length:-1 Offset:-1 Message:TargetNamespace.1: Expecting namespace 'http://www.loc.gov/mods/', but the target namespace of the schema document is 'http://www.loc.gov/mods/v3'.]

I guess this example file is out of synch with more recent developments in the schema?

Cheers,

Alistair

On Mon, Dec 22, 2008 at 03:31:50PM -0500, Ed Summers wrote:
> Hey Alistair:
>
> On Mon, Dec 22, 2008 at 1:16 PM, Alistair Miles
> <alistair.mi...@zoo.ox.ac.uk> wrote:
> > Any tips for how I could turn these data into RDF?
>
> If you want to work specifically with that dataset you could download
> the different parts Karen pointed you to, and convert to MARCXML using
> an efficient tool like yaz-marcdump [1]. yaz-marcdump is nice it will
> convert from MARC-8 to UTF-8.
>
> Once you've got it in MARCXML you could then use a stylesheet like
> LC's [2] to convert to DublinCore flavored RDF. This might be kinda
> lossy for your RDA work though, so you might want MARCXML->MODS [3],
> and then use the MODS->RDF conversion that the Simile folks created
> (which Karen also pointed you to) [4].
>
> In fact Simile used that stylesheet on their own MIT Library Catalog
> MARC data (Barton) and still seem to have the result online [5]. So
> perhaps just using the Barton data is the quickest way to begin
> playing with what once was MARC data as RDF? To my knowledge Stefano
> Mazzocchi simply created an RDF vocabulary that mirrors the MODS XML
> Schema, but I haven't looked at it in a while.
>
> Another thing worth checking out might be Rob Styles work [6] with
> other people at Talis at converting MARC with full fidelity to RDF.
> Perhaps he has some tools (or data) at his disposal? Rob you are on
> here right?
>
> I'd be willing to lend a hand with some of this if necessary, so just
> let me know if you think I can help.
>
> //Ed
>
> [1] http://www.indexdata.com/yaz/doc/yaz-marcdump.tkl
> [2] http://www.loc.gov/standards/marcxml/xslt/MARC21slim2RDFDC.xsl
> [3] http://www.loc.gov/standards/mods/v3/MARC21slim2MODS3.xsl
> [4] http://simile.mit.edu/wiki/MARC/MODS_RDFizer
> [5] http://simile.mit.edu/wiki/Dataset:_Barton
> [6]
> http://events.linkeddata.org/ldow2008/papers/02-styles-ayers-semantic-marc.pdf

-- 
Alistair Miles
Senior Computing Officer
Image Bioinformatics Research Group
Department of Zoology
The Tinbergen Building
University of Oxford
South Parks Road
Oxford
OX1 3PS
United Kingdom
Web: http://purl.org/net/aliman
Email: alistair.mi...@zoo.ox.ac.uk
Tel: +44 (0)1865 281993
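
On the root-element question above: the MARCXML schema defines a "collection" element in the http://www.loc.gov/MARC21/slim namespace as a container for multiple "record" elements, so wrapping the records looks like a reasonable fix. A minimal Python sketch of that wrapping, assuming the yaz-marcdump output has already been read into a string. The helper names wrap_records and count_records and the two sample records are made up for illustration; they are not part of yaz or of any LoC tool.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim") documents.
MARCXML_NS = "http://www.loc.gov/MARC21/slim"


def wrap_records(fragment: str) -> str:
    """Wrap a run of sibling <record> elements in one <collection> root."""
    # An XML declaration is only legal at the very start of a document,
    # so drop any that appear in the fragment before wrapping it.
    body = re.sub(r"<\?xml[^>]*\?>", "", fragment).strip()
    return f'<collection xmlns="{MARCXML_NS}">\n{body}\n</collection>\n'


def count_records(document: str) -> int:
    """Parse the document (this fails if it is not well-formed) and count records."""
    root = ET.fromstring(document)
    return len(root.findall(f"{{{MARCXML_NS}}}record"))


# Hypothetical stand-in for the converter output: two records, no root element.
sample = (
    f'<record xmlns="{MARCXML_NS}"><leader>00000nam a2200000 a 4500</leader></record>\n'
    f'<record xmlns="{MARCXML_NS}"><leader>00000nam a2200000 a 4500</leader></record>\n'
)

if __name__ == "__main__":
    print(count_records(wrap_records(sample)))
```

Parsing the unwrapped sample directly raises a parse error because it has more than one top-level element; after wrapping, the same records parse as children of a single root. For files as large as the LoC parts, a streaming approach would likely be preferable to holding everything in one string.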