Fair point. Just instinct on my part that putting it in a triple is a bit ugly :)
It probably doesn't make any difference, although I don't think storing in a triple ensures that it sticks to the object (you could store the triple anywhere as well).

Owen

Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: [email protected]
Telephone: 0121 288 6936

On 6 Dec 2011, at 22:43, Fleming, Declan wrote:

> Hi - point at it where? We could point back to the library catalog that we
> harvested in the MARC to MODS to RDF process, but what if that goes away?
> Why not write ourselves a 1K insurance policy that sticks with the object for
> its life?
>
> D
>
> -----Original Message-----
> From: Code for Libraries [mailto:[email protected]] On Behalf Of Owen Stephens
> Sent: Tuesday, December 06, 2011 8:06 AM
> To: [email protected]
> Subject: Re: [CODE4LIB] Models of MARC in RDF
>
> I'd suggest that rather than shove it in a triple it might be better to point
> at alternative representations, including MARC if desirable (keep meaning to
> blog some thoughts about progressively enhanced metadata...)
>
> Owen
>
> Owen Stephens
> Owen Stephens Consulting
> Web: http://www.ostephens.com
> Email: [email protected]
> Telephone: 0121 288 6936
>
> On 6 Dec 2011, at 15:44, Karen Coyle wrote:
>
>> Quoting "Fleming, Declan" <[email protected]>:
>>
>>> Hi - I'll note that the mapping decisions were made by our metadata
>>> services (then Cataloging) group, not by the tech folks making it all
>>> work, though we were all involved in the discussions. One idea that
>>> came up was to do a, perhaps, lossy translation, but also stuff one
>>> triple with a text dump of the whole MARC record just in case we
>>> needed to grab some other element out we might need. We didn't do
>>> that, but I still like the idea. Ok, it was my idea. ;)
>>
>> I like that idea! Now that "disk space" is no longer an issue, it makes good
>> sense to keep around the "original state" of any data that you transform,
>> just in case you change your mind. I hadn't thought about incorporating the
>> entire MARC record string in the transformation, but as I recall the average
>> size of a MARC record is somewhere around 1K, which really isn't all that
>> much by today's standards.
>>
>> (As an old-timer, I remember running the entire Univ. of California
>> union catalog on 35 megabytes, something that would now be considered
>> a smallish email attachment.)
>>
>> kc
>>
>>>
>>> D
>>>
>>> -----Original Message-----
>>> From: Code for Libraries [mailto:[email protected]] On Behalf Of Esme Cowles
>>> Sent: Monday, December 05, 2011 11:22 AM
>>> To: [email protected]
>>> Subject: Re: [CODE4LIB] Models of MARC in RDF
>>>
>>> I looked into this a little more closely, and it turns out it's a little
>>> more complicated than I remembered. We built support for transforming to
>>> MODS using the MARC21slim2MODS.xsl stylesheet, but don't use that.
>>> Instead, we use custom Java code to do the mapping.
>>>
>>> I don't have a lot of public examples, but there's at least one public
>>> object which you can view the MARC from our OPAC:
>>>
>>> http://roger.ucsd.edu/search/.b4827884/.b4827884/1,1,1,B/detlmarc~1234567&FF=&1,0,
>>>
>>> The public display in our digital collections site:
>>>
>>> http://libraries.ucsd.edu/ark:/20775/bb0648473d
>>>
>>> The RDF for the MODS looks like:
>>>
>>> <mods:classification rdf:parseType="Resource">
>>>   <mods:authority>local</mods:authority>
>>>   <rdf:value>FVLP 222-1</rdf:value>
>>> </mods:classification>
>>> <mods:identifier rdf:parseType="Resource">
>>>   <mods:type>ARK</mods:type>
>>>   <rdf:value>http://libraries.ucsd.edu/ark:/20775/bb0648473d</rdf:value>
>>> </mods:identifier>
>>> <mods:name rdf:parseType="Resource">
>>>   <mods:namePart>Brown, Victor W</mods:namePart>
>>>   <mods:type>personal</mods:type>
>>> </mods:name>
>>> <mods:name rdf:parseType="Resource">
>>>   <mods:namePart>Amateur Film Club of San Diego</mods:namePart>
>>>   <mods:type>corporate</mods:type>
>>> </mods:name>
>>> <mods:originInfo rdf:parseType="Resource">
>>>   <mods:dateCreated>[196-]</mods:dateCreated>
>>> </mods:originInfo>
>>> <mods:originInfo rdf:parseType="Resource">
>>>   <mods:dateIssued>2005</mods:dateIssued>
>>>   <mods:publisher>Film and Video Library, University of California, San Diego, La Jolla, CA 92093-0175 http://orpheus.ucsd.edu/fvl/FVLPAGE.HTM</mods:publisher>
>>> </mods:originInfo>
>>> <mods:physicalDescription rdf:parseType="Resource">
>>>   <mods:digitalOrigin>reformatted digital</mods:digitalOrigin>
>>>   <mods:note>16mm; 1 film reel (25 min.) :; sd., col. ;</mods:note>
>>> </mods:physicalDescription>
>>> <mods:subject rdf:parseType="Resource">
>>>   <mods:authority>lcsh</mods:authority>
>>>   <mods:topic>Ranching</mods:topic>
>>> </mods:subject>
>>>
>>> etc.
>>>
>>> There is definitely some loss in the conversion process -- I don't know
>>> enough about the MARC leader and control fields to know if they are
>>> captured in the MODS and/or RDF in some way. But there are quite a few
>>> local and note fields that aren't present in the RDF. Other fields (e.g.
>>> 300 and 505) are mapped to MODS, but not displayed in our access system
>>> (though they are indexed for searching).
>>>
>>> I agree it's hard to quantify lossy-ness. Counting fields or characters
>>> would be the most objective, but has obvious problems with control
>>> characters sometimes containing a lot of information, and then the relative
>>> importance of different fields to the overall description. There are other
>>> issues too -- some fields in this record weren't migrated because they
>>> duplicated collection-wide values, which are formulated slightly
>>> differently from the MARC record. Some fields weren't migrated because
>>> they concern the physical object, and therefore don't really apply to the
>>> digital object. So that really seems like a morass to me.
>>>
>>> -Esme
>>> --
>>> Esme Cowles <[email protected]>
>>>
>>> "Necessity is the plea for every infringement of human freedom. It is
>>> the argument of tyrants; it is the creed of slaves." -- William
>>> Pitt, 1783
>>>
>>> On 12/3/2011, at 10:35 AM, Karen Coyle wrote:
>>>
>>>> Esme, let me second Owen's enthusiasm for more detail if you can
>>>> supply it. I think we also need to start putting these efforts along
>>>> a "loss" continuum - MODS is already lossy vis-a-vis MARC, and my
>>>> guess is that some of the other MARC->RDF transforms don't include
>>>> all of the warts and wrinkles of MARC. LC's new bibliographic
>>>> framework document sets as a goal to bring along ALL of MARC (a
>>>> decision that I think isn't obvious, as we have already discussed
>>>> here). If we say we are going from MARC to RDF, how much is actually
>>>> captured in the transformed data set? (Yes, that's going to be hard
>>>> to quantify.)
>>>>
>>>> kc
>>>>
>>>> Quoting Esme Cowles <[email protected]>:
>>>>
>>>>> Owen-
>>>>>
>>>>> Another strategy for capturing MARC data in RDF is to convert it to MODS
>>>>> (we do this using the LoC MARC to MODS stylesheet:
>>>>> http://www.loc.gov/standards/marcxml/xslt/MARC21slim2MODS.xsl). From
>>>>> there, it's pretty easy to incorporate into RDF. There are some issues
>>>>> to be aware of, such as how to map the MODS XML names to predicates and
>>>>> how to handle elements that can appear in multiple places in the
>>>>> hierarchy.
>>>>>
>>>>> -Esme
>>>>> --
>>>>> Esme Cowles <[email protected]>
>>>>>
>>>>> "Necessity is the plea for every infringement of human freedom. It
>>>>> is the argument of tyrants; it is the creed of slaves." -- William
>>>>> Pitt, 1783
>>>>>
>>>>> On 11/28/2011, at 8:25 AM, Owen Stephens wrote:
>>>>>
>>>>>> It would be great to start collecting transforms together - just a
>>>>>> quick brain dump of some I'm aware of
>>>>>>
>>>>>> MARC21 transformations
>>>>>>
>>>>>> Cambridge University Library - http://data.lib.cam.ac.uk -
>>>>>> transformation made available (in code) from same site
>>>>>>
>>>>>> Open University - http://data.open.ac.uk - specific transform for
>>>>>> materials related to teaching, code available at
>>>>>> http://code.google.com/p/luceroproject/source/browse/trunk%20luceroproject/OULinkedData/src/uk/ac/open/kmi/lucero/rdfextractor/RDFExtractor.java
>>>>>> (MARC transform is in libraryRDFExtraction method)
>>>>>>
>>>>>> COPAC - small set of records from the COPAC Union catalogue - data
>>>>>> and transform not yet published
>>>>>>
>>>>>> Podes Projekt - LinkedAuthors - documentation at
>>>>>> http://bibpode.no/linkedauthors/doc/Pode-LinkedAuthors-Documentation.pdf
>>>>>> - 2 stage transformation firstly from MARC to FRBRized version of
>>>>>> data, then from FRBRized data to RDF. These linked from documentation
>>>>>>
>>>>>> Podes Project - LinkedNonFiction - documentation at
>>>>>> http://bibpode.no/linkednonfiction/doc/Pode-LinkedNonFiction-Documentation.pdf
>>>>>> - MARC data transformed using xslt
>>>>>> https://github.com/pode/LinkedNonFiction/blob/master/marcslim2n3.xsl
>>>>>>
>>>>>> British Library British National Bibliography -
>>>>>> http://www.bl.uk/bibliographic/datafree.html - data model
>>>>>> documented, but no code available
>>>>>>
>>>>>> Libris.se - some notes in various presentations/blogposts (e.g.
>>>>>> http://dc2008.de/wp-content/uploads/2008/09/malmsten.pdf) but
>>>>>> can't find explicit transformation
>>>>>>
>>>>>> Hungarian National library -
>>>>>> http://thedatahub.org/dataset/hungarian-national-library-catalog
>>>>>> and http://nektar.oszk.hu/wiki/Semantic_web#Implementation - some
>>>>>> information on ontologies used but no code or explicit
>>>>>> transformation (not 100% sure this is from MARC)
>>>>>>
>>>>>> Talis - implemented in several live catalogues including
>>>>>> http://catalogue.library.manchester.ac.uk/ - no documentation or
>>>>>> code afaik although some notes in
>>>>>>
>>>>>> MAB transformation
>>>>>>
>>>>>> HBZ - some of the transformation documented at
>>>>>> https://wiki1.hbz-nrw.de/display/SEM/Converting+the+Open+Data+from+the+hbz+to+BIBO,
>>>>>> don't think any code published?
>>>>>>
>>>>>> Would be really helpful if more projects published their
>>>>>> transformations (or someone told me where to look!)
>>>>>>
>>>>>> Owen
>>>>>>
>>>>>> Owen Stephens
>>>>>> Owen Stephens Consulting
>>>>>> Web: http://www.ostephens.com
>>>>>> Email: [email protected]
>>>>>> Telephone: 0121 288 6936
>>>>>>
>>>>>> On 26 Nov 2011, at 15:58, Karen Coyle wrote:
>>>>>>
>>>>>>> A few of the code4lib talk proposals mention projects that have or will
>>>>>>> transform MARC records into RDF. If any of you have documentation
>>>>>>> and/or examples of this, I would be very interested to see them, even
>>>>>>> if they are "under construction."
>>>>>>>
>>>>>>> Thanks,
>>>>>>> kc
>>>>>>>
>>>>>>> --
>>>>>>> Karen Coyle
>>>>>>> [email protected] http://kcoyle.net
>>>>>>> ph: 1-510-540-7596
>>>>>>> m: 1-510-435-8234
>>>>>>> skype: kcoylenet
>>>>>
>>>>
>>>> --
>>>> Karen Coyle
>>>> [email protected] http://kcoyle.net
>>>> ph: 1-510-540-7596
>>>> m: 1-510-435-8234
>>>> skype: kcoylenet
>>>
>>
>> --
>> Karen Coyle
>> [email protected] http://kcoyle.net
>> ph: 1-510-540-7596
>> m: 1-510-435-8234
>> skype: kcoylenet
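
A rough sketch of the two options debated in this thread, for anyone who wants to see them side by side: keeping a dump of the whole MARC record in one triple on the object, versus pointing at a MARC representation held elsewhere. Nobody in the thread posted code for this, so everything here is an assumption made for illustration: the two predicate URIs are invented, the output is plain N-Triples strings, and the record is base64-encoded only because raw MARC contains control characters that are awkward inside a literal.

```python
import base64

# Hypothetical predicates, invented for this sketch; not used by any project in the thread.
RAW_MARC = "http://example.org/terms/rawMarc"
ALT_FORMAT = "http://example.org/terms/alternateRepresentation"
XSD_BASE64 = "http://www.w3.org/2001/XMLSchema#base64Binary"


def embed_triple(subject: str, marc: bytes) -> str:
    """Option 1: keep the whole record as a literal attached to the object."""
    encoded = base64.b64encode(marc).decode("ascii")
    return f'<{subject}> <{RAW_MARC}> "{encoded}"^^<{XSD_BASE64}> .'


def pointer_triple(subject: str, marc_url: str) -> str:
    """Option 2: link to a MARC representation that lives somewhere else."""
    return f"<{subject}> <{ALT_FORMAT}> <{marc_url}> ."


def recover_marc(triple: str) -> bytes:
    """Get the original bytes back out of a line produced by embed_triple()."""
    literal = triple.split('"')[1]  # base64 text never contains a double quote
    return base64.b64decode(literal)
```

With something like this, the embedded copy round-trips byte for byte (`recover_marc(embed_triple(s, record)) == record`), while the pointer stays small but is only as durable as the URL it names, which is the trade-off both sides were arguing about.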
