Re: [CODE4LIB] transforming marc to rdf
Hi Eric, you seem to have missed the Catmandu tutorial at SWIB13. Luckily there is a basic tutorial and a demo online: http://librecat.org/ The demo happens to be about transforming MARC to RDF using the Catmandu Perl framework. It gives you full flexibility by separating the importer from the exporter and providing a domain-specific language for “fixing” the data in between. Catmandu also has easy-to-use wrappers for popular search engines and databases (both SQL and NoSQL), making it a complete ETL (extract, transform, load) toolkit. Disclosure: I am a Catmandu contributor. It's free and open source software. Cheers, Christian

On Wed, Dec 04, 2013 at 09:59:46PM -0500, Eric Lease Morgan wrote: Converting MARC to RDF has been more problematic. There are various tools enabling me to convert my original MARC into MARCXML and/or MODS. After that I can reportedly use a few tools to convert to RDF: * MARC21slim2RDFDC.xsl [3] - functions, but even for my tastes the resulting RDF is too vanilla. [4] * modsrdf.xsl [5] - optimal, but when I use my transformation engine (Saxon), I do not get XML but rather plain text * BIBFRAME Tools [6] - sports nice ontologies, but the online tools won’t scale for large operations

-- Christian Pietsch · http://www.ub.uni-bielefeld.de/~cpietsch/ LibTec · Library Technology and Knowledge Management Bielefeld University Library, Bielefeld, Germany
Re: [CODE4LIB] transforming marc to rdf [comet]
On Dec 4, 2013, at 10:29 PM, Corey A Harper corey.har...@nyu.edu wrote: Have you had a look at Ed Chamberlain's work on COMET: https://github.com/edchamberlain/COMET It's been a while since I've run this, but if I remember correctly, it was fairly easy to use.

Thank you for the pointer. I downloaded the COMET “suite” and got good output, but only after I tweaked the source code to require the Perl Encode module: ./marc2rdf_batch.pl pamphlets.marc The result was a huge set of triples saved as RDF/Turtle. I then used a Java archive (RDF2RDF [1]) to painlessly convert the Turtle to RDF/XML. The process worked. It was “easy” for me, sort of, but it employs quite a number of sophisticated underlying technologies. I could integrate everything into a whole, but… On to explore other options. [1] RDF2RDF - http://www.l3s.de/~minack/rdf2rdf/ — Sleepless In South Bend
Re: [CODE4LIB] transforming marc to rdf [mods_rdfizer]
On Dec 4, 2013, at 10:29 PM, Corey A Harper corey.har...@nyu.edu wrote: Also, though much older, I seem to remember the Simile MARC RDFizer being a pretty straightforward one to run: http://simile.mit.edu/wiki/MARC/MODS_RDFizer MODS aficionados will point to some problems with some of its choices for representing that data, but still a good starting point (IMO).

Again, thanks for the pointer. I downloaded MODS_RDFizer and got it to run, but it was a good thing that I already had mvn installed. The output was an RDF/XML file, and I concur, the implemented ontology is “interesting”. The distribution includes a possibly cool stylesheet — mods2rdf.xslt. Maybe I can use this. Hmm… —Still Sleepless
Re: [CODE4LIB] transforming marc to rdf [mods_rdfizer]
On Dec 5, 2013, at 6:54 AM, Eric Lease Morgan emor...@nd.edu wrote: http://simile.mit.edu/wiki/MARC/MODS_RDFizer ...The distribution includes a possibly cool stylesheet — mods2rdf.xslt. Ah ha! The MODS_RDFizer’s mods2rdf.xslt file functioned very well against one of my MODS files: $ xsltproc mods2rdf.xslt pamphlets.mods pamphlets.rdf Mods2rdf.xslt could very easily be configured at the beginning of the file to suit the needs of a local “cultural heritage institution”. I like the use of XSL to create a serialized RDF as opposed to the use of an application because less infrastructure is needed to make things happen. — Too Much Coffee?
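For a whole directory of MODS files, the single-file invocation above extends naturally to a loop. A minimal sketch, assuming a mods/ input directory and an rdf/ output directory (both names are my own, not from the thread):

```shell
# Batch-transform MODS files to RDF/XML with xsltproc.
# The mods/ and rdf/ directory names and the stylesheet path are assumptions.
mkdir -p rdf
for f in mods/*.mods; do
    [ -e "$f" ] || continue            # skip quietly when the glob matches nothing
    base=$(basename "$f" .mods)
    xsltproc mods2rdf.xslt "$f" > "rdf/$base.rdf"
done
```

Because xsltproc streams one file at a time, a loop like this keeps memory use flat even over large collections.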
Re: [CODE4LIB] transforming marc to rdf [catmandu]
On Dec 5, 2013, at 3:07 AM, Christian Pietsch chr.pietsch+web4...@googlemail.com wrote: you seem to have missed the Catmandu tutorial at SWIB13. Luckily there is a basic tutorial and a demo online: http://librecat.org/

I did attend SWIB13, and I really wanted to go to the Catmandu workshop, but since I’m a Perl “aficionado” I figured I could play with it later on my own. Instead I attended the workshop on provenance. (Travelogue is pending.) In any event, playing with the Catmandu demo was insightful. [1] I see and understand the workflow: import data, fix it, store it, fix it, export it. I see how it is designed to use many import and export formats. The key to the software seems to be two-fold: 1) the ability to read and write Perl programs, and 2) understanding Catmandu’s “fix” language. There are great possibilities here for us Perl folks. Thank you for re-bringing it to my attention. [1] demo - http://demo.librecat.org — Eric Lease Morgan
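For the record, the import → fix → export workflow can be sketched from the command line. The fix rules below are illustrative guesses at a mapping, not a vetted MARC profile, and the file names are mine:

```shell
# Write a tiny Catmandu "fix" file; the field mappings are illustrative only.
cat > marc.fix <<'EOF'
marc_map('245a', 'title')
marc_map('100a', 'creator')
marc_map('260c', 'date')
EOF

# Applying it (shown, not run here, since it needs Catmandu and real MARC):
#   catmandu convert MARC to JSON --fix marc.fix < pamphlets.marc
echo "wrote $(wc -l < marc.fix | tr -d ' ') fix rules"
```

The same fix file works unchanged whether the exporter is JSON, YAML, or an RDF serialization, which is the separation of importer and exporter Christian describes.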
Re: [CODE4LIB] transforming marc to rdf
Eric, I'm having a hard time figuring out exactly what you're hoping to get. Going from MARC to RDF was my great white whale for years while Talis' main business interests involved both of those (although not archival collections). Anything that will remodel MARC to (decent) RDF is going to be: - Non-trivial to install - Non-trivial to use - Slow - Require massive amounts of memory/disk space Choose any two.

Frankly, I don't see how you can generate RDF that anybody would want to use from XSLT: where would your URIs come from? What, exactly, are you modeling? I guess, to me, it would be a lot more helpful for you to take an archival MARC record and, by hand, build an RDF graph from it, then figure out your mappings. I just don't see any way to make it easy to use, at least not until you have an agreed-upon model to map to. -Ross.

On Thu, Dec 5, 2013 at 3:07 AM, Christian Pietsch chr.pietsch+web4...@googlemail.com wrote: Hi Eric, you seem to have missed the Catmandu tutorial at SWIB13. Luckily there is a basic tutorial and a demo online: http://librecat.org/
Re: [CODE4LIB] Discovery layer for Primo
Everything I've heard from Ex Libris is that Alma is discovery layer agnostic, though they understandably want you to use Primo since it's their product. Perhaps the differentiation is that they won't host third-party discovery layers at this time? If you wanted to use Blacklight/VuFind/etc. it should work but would have to be self-hosted.
Re: [CODE4LIB] transforming marc to rdf [to batch or not to batch]
When exposing sets of MARC records as linked data, do you think it is better to expose them in batch (collection) files or as individual RDF serializations? To bastardize the Bard — “To batch or not to batch? That is the question.”

Suppose I am a medium-sized academic research library. Suppose my collection is comprised of approximately 3.5 million bibliographic records. Suppose I want to expose those records via linked data. Suppose further that this will be done by “simply” making RDF serialization files (XML, Turtle, etc.) accessible via an HTTP filesystem. No scripts. No programs. No triple stores. Just files on an HTTP file system coupled with content negotiation. Given these assumptions, would you: 1. create batches of MARC records, convert them to MARCXML and then to RDF, and save these files to disc, or 2. parse the batches of MARC records into individual records, convert them into MARCXML and then RDF, and save these files to disc

Option #1 would require heavy lifting against large files, but the number of resulting files to save to disc would be relatively few — reasonably managed in a single directory on disc. On the other hand, individual URIs pointing to individual serializations would not be accessible. They would only be accessible by retrieving the collection file in which they reside. Moreover, a mapping of individual URIs to collection files would need to be maintained. Option #2 would be easier on the computing resources because processing little files is generally easier than processing big ones. On the other hand, the number of files generated by this option is not easily managed without a sophisticated directory structure. (It is not feasible to put 3.5 million files in a single directory.) But I would still need to create a mapping from URI to directory. In either case, I would probably create a bunch of site map files denoting the locations of my serializations — YAP (Yet Another Mapping).
I’m leaning towards Option #2 because individual URIs could be resolved more easily with “simple” content negotiation. (Given my particular use case — archival MARC records — I don’t think I’d really have more than a few thousand items, but I’m asking the question on a large scale anyway.) — Eric Morgan
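On the “millions of files” problem in Option #2, a common trick is to shard the directory tree by a hash of the record identifier, which also gives a deterministic URI-to-path mapping for free. A sketch, assuming md5sum is available and a two-level layout (the rdf/ root, the .rdf suffix, and the sample identifier are all my own):

```shell
# Map a record identifier to a two-level hashed directory path, spreading
# millions of files evenly across at most 65,536 leaf directories.
path_for() {
    h=$(printf '%s' "$1" | md5sum | cut -c1-4)   # first 4 hex digits of the hash
    printf 'rdf/%s/%s/%s.rdf\n' \
        "$(echo "$h" | cut -c1-2)" "$(echo "$h" | cut -c3-4)" "$1"
}
path_for "ocm00012345"   # prints rdf/xx/yy/ocm00012345.rdf, xx/yy from the hash
```

Because the path is computed from the identifier alone, content negotiation can resolve a URI to a file without consulting a separately maintained mapping.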
Re: [CODE4LIB] book cover api
On second thought, IIIF won't work for my situation either, though it offers much more flexible manipulation on an individual basis. My situation is: I have a loop to list many books, wanting a book cover image for each book. Kelly

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Keith Jenkins Sent: 4 December 2013 13:50 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] book cover api So, any bets on which book cover image provider will be the first to implement IIIF? http://www-sul.stanford.edu/iiif/image-api/1.1/ Keith

On Wed, Dec 4, 2013 at 2:41 PM, Karen Coyle li...@kcoyle.net wrote: Open Library book covers come in S, M and L - https://openlibrary.org/dev/docs/api/covers Of course, if what you want isn't exactly one of those... kc

On 12/4/13 9:34 AM, Kaile Zhu wrote: A while ago, we had a discussion about book cover APIs. I tried some of those mentioned and found they work to some degree, but none of them would offer the size I want. The flexibility of the size is just not there. The size I am looking for is like this: http://img1.imagesbn.com/p/9780316227940_p0_v2_s114x166.JPG Has anybody found a way of implementing a book cover API to your specifications successfully, and is willing to share that with me? Off-line if you want. Much appreciation. Thanks. Kelly Zhu 405-974-5957 kz...@uco.edu **Bronze+Blue=Green** The University of Central Oklahoma is Bronze, Blue, and Green! Please print this e-mail only if absolutely necessary! **CONFIDENTIALITY** This e-mail (including any attachments) may contain confidential, proprietary and privileged information. Any unauthorized disclosure or use of this information is prohibited. -- Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] transforming marc to rdf
On Dec 5, 2013, at 8:55 AM, Ross Singer rossfsin...@gmail.com wrote: Eric, I'm having a hard time figuring out exactly what you're hoping to get. Going from MARC to RDF was my great white whale for years while Talis' main business interests involved both of those (although not archival collections). Anything that will remodel MARC to (decent) RDF is going to be: - Non-trivial to install - Non-trivial to use - Slow - Require massive amounts of memory/disk space Choose any two. Frankly, I don't see how you can generate RDF that anybody would want to use from XSLT: where would your URIs come from? What, exactly, are you modeling? I guess, to me, it would be a lot more helpful for you to take an archival MARC record and, by hand, build an RDF graph from it, then figure out your mappings. I just don't see any way to make it easy to use, at least not until you have an agreed-upon model to map to.

Ross, good questions. I’m hoping to articulate and implement a simple and functional method for exposing EAD and MARC metadata as linked data. “Simple and functional” are the operative words; I’m not necessarily looking for “fast”, “best”, or “perfect”. I am trying to articulate something that requires the least amount of infrastructure and technical expertise.

Reasonable RDF through XSLT? Good point. I like the use of XSLT because it does not require very much technical infrastructure — just ubiquitous XSLT processors like Saxon or xsltproc. I have identified two or three stylesheets transforming MARCXML/MODS into RDF/XML. 1. The first comes from the Library of Congress and uses Dublin Core as its ontology, but the resulting RDF has no URIs and the Dublin Core is not good enough, even for my tastes. [1] 2. The second also comes from the Library of Congress, and it uses a richer, more standard ontology, but I can’t get it to work. All I get as output is a plain text file. I must be doing something wrong. [2] 3. I found the third stylesheet buried in the MARC/MODS RDFizer.
The sheet uses XSLT 1.0, which is good for my xsltproc-like tools. I get output, which is better than Sheet #2. The ontology is a bit MIT-specific, but it is one heck of a lot richer than Sheet #1. Moreover, the RDF includes URIs. [3, 4] In none of these cases will the ontology be best or perfect, but for right now I don’t care. The ontology is good enough. Heck, the ontologies don’t even come close to the ontology I get when transforming my EAD to RDF using the Archives Hub stylesheet. [5] I just want to expose the content as linked data. Somebody else — the community — can come behind to improve the stylesheets and their ontologies.

Where will I get the URIs from? I will get them by combining some sort of unique code (like an OCLC symbol) or namespace with the value of the MARC records' 001 fields. Here is an elaboration of my original recipe for making MARC metadata accessible via linked data:

1. obtain a set of MARC records
2. parse out a record from the set
3. convert it to MARCXML
4. transform the MARCXML into HTML
5. transform the MARCXML into RDF (probably through MODS first)
6. save the HTML and RDF to disc
7. update a mapping file / data structure denoting where things are located
8. go to Step #2 for each record in the set
9. use the mapping to create a set of site map files
10. use the mapping to support HTTP content negotiation
11. create an index.html file allowing humans to browse the collection as well as point robots to the RDF
12. for extra credit, import all the RDF into a triple store and provide access via SPARQL

I think I can do the same thing with EAD files. Moreover, I think I can do this with a small number of (Perl) scripts easily readable by others, enabling them to implement the scripts in a programming language of their choice. Once I get this far, metadata experts can improve the ontologies, and computer scientists can improve the infrastructure. In the meantime the linked data can be harvested for the good purposes for which linked data was articulated.
It is in my head. It really is. All I need is the time, focus, and energy to implement it. On my mark. Get set. Go. [1] MARC21slim2RDFDC.xsl - http://www.loc.gov/standards/marcxml/xslt/MARC21slim2RDFDC.xsl [2] modsrdf.xsl - http://www.loc.gov/standards/mods/modsrdf/xsl-files/modsrdf.xsl [3] mods2rdf.xslt - http://infomotions.com/tmp/mods2rdf.xslt [4] MARC/MODS RDFizer - http://simile.mit.edu/wiki/MARC/MODS_RDFizer [5] ead2rdf.xsl - http://data.archiveshub.ac.uk/xslt/ead2rdf.xsl — Eric Lease Morgan
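The URI-minting step of the recipe (institutional code plus the 001 control number) is simple enough to sketch. The base URL and the "XYZ" code below are placeholders of my own, not anything Eric specified:

```shell
# Mint a URI from an institutional code plus a MARC 001 control number.
# "example.org" and the "XYZ" code are placeholder assumptions.
mint_uri() {
    # 001 values often carry stray whitespace; strip it before composing.
    clean=$(printf '%s' "$2" | tr -d '[:space:]')
    printf 'http://example.org/id/%s/%s\n' "$1" "$clean"
}
mint_uri XYZ ' ocm00012345 '   # prints http://example.org/id/XYZ/ocm00012345
```

Normalizing the 001 value before minting matters: the same record must always yield the same URI, or the site maps and content-negotiation mapping drift apart.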
Re: [CODE4LIB] transforming marc to rdf
On Thu, Dec 5, 2013 at 11:11 AM, Eric Lease Morgan emor...@nd.edu wrote: I’m hoping to articulate and implement a simple and functional method for exposing EAD and MARC metadata as linked data. Isn't the point of this to expose archival description as linked data? What about description maintained in applications like a collection management system, say, ArchivesSpace or Archivists' Toolkit? Mark -- Mark A. Matienzo m...@matienzo.org Director of Technology, Digital Public Library of America
Re: [CODE4LIB] transforming marc to rdf
On Dec 5, 2013, at 11:17 AM, Mark A. Matienzo mark.matie...@gmail.com wrote: I’m hoping to articulate and implement a simple and functional method for exposing EAD and MARC metadata as linked data. Isn't the point of this to expose archival description as linked data? What about description maintained in applications like a collection management system, say, ArchivesSpace or Archivists' Toolkit?

Good question! At the very least, these applications (ArchivesSpace, Archivists’ Toolkit, etc.) can regularly and systematically export their data as EAD, and the EAD can be made available as linked data. It would be ideal if the applications were to natively make their metadata available as linked data, but exporting their content as EAD is a functional stopgap solution. —Eric Morgan
[CODE4LIB] Job: Database Administrator/IT Specialist, BDLSS at University of Oxford
Database Administrator/IT Specialist, BDLSS University of Oxford Oxford We are seeking an experienced Database Administrator to join our established Bodleian Digital Library Systems and Support team. The team provides support and development services for the libraries' core service applications including the integrated library system, resource discovery platforms and admissions system. You will be responsible for the management of key library system databases, principally the Bodleian Libraries Admissions card service, and will additionally be required to provide support to the library catalogue and resource discovery services. You will have significant experience in managing SQL Server 2008 applications, be familiar with use of the Unix/Linux command line and have scripting experience. Some knowledge of web technologies including HTML and CSS is essential, as are good problem-solving, analytical and communication skills. Applications are to be made online. To apply for this role and for further details, including a job description and selection criteria, please click on the link below: https://www.recruit.ox.ac.uk/pls/hrisliverecruit/erq_jobspec_version_4.jobspec?p_id=110819 You will be required to upload a supporting statement as part of your online application. Only applications received before midday on Monday 6th January 2014 can be considered. Interviews are anticipated to take place during w/c 13 January 2014. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/11057/
Re: [CODE4LIB] transforming marc to rdf
On Thu, Dec 5, 2013 at 11:26 AM, Eric Lease Morgan emor...@nd.edu wrote: Good question! At the very least, these applications (ArchivesSpace, Archivists’ Toolkit, etc.) can regularly and systematically export their data as EAD, and the EAD can be made available as linked data. It would be ideal if the applications were to natively make their metadata available as linked data, but exporting their content as EAD is a functional stopgap solution. —Eric Morgan

Wouldn't it make more sense, especially with a system like ArchivesSpace, which provides a backend HTTP API and a public UI, to publish linked data directly instead of adding yet another stopgap? Mark -- Mark A. Matienzo m...@matienzo.org Director of Technology, Digital Public Library of America
Re: [CODE4LIB] transforming marc to rdf
On Dec 5, 2013, at 11:33 AM, Mark A. Matienzo mark.matie...@gmail.com wrote: At the very least, these applications (ArchivesSpace, Archivists’ Toolkit, etc.) can regularly and systematically export their data as EAD, and the EAD can be made available as linked data. Wouldn't it make more sense, especially with a system like ArchivesSpace, which provides a backend HTTP API and a public UI, to publish linked data directly instead of adding yet another stopgap?

Publishing via a content management system would make more sense if: 1. the archivist uses the specific content management system, and 2. the content management system supports the functionality. “There is more than one way to skin a cat.” There are advantages and disadvantages to every software solution. — Eric
Re: [CODE4LIB] transforming marc to rdf
With apologies to Eric and others from the LiAM project, I feel like I want to jump in here with a little more context. Eric, or Aaron, or Anne, please feel free to correct any of what I say below.

I agree with the points made and concerns raised by both Ross and Mark -- most significantly, that a sustainable infrastructure for linked archival metadata is not going to come from an XSLT stylesheet. However, I also see tremendous value in what Eric is putting together here. The prospectus for the LiAM project, which is the context for Eric's questions, is about developing guiding principles and educational tools for the archival community to better understand, prepare for, and contribute to the kind of infrastructure both Ross and Mark are talking about: http://sites.tufts.edu/liam/deliverables/prospectus-for-linked-archival-metadata-a-guidebook/

While I agree that converting legacy data in EAD and MARC formats to RDF is not the approach this work will take in the future, I also believe that these are formats that the archival community is very familiar with, and XSLT is a tool that many archivists work with regularly. A workflow that lets that community experiment is a laudable goal. In short, I think we need approaches that illustrate the potential of linked data in archives, highlight some of the shortcomings in our current metadata management frameworks, help archivists get their metadata ready for what Mark is describing in the context of ArchivesSpace (e.g. please use id attributes in c tags!!), and give them a more complete picture of why doing so is of some value.

Sorry for the long message, and I hope that the context is helpful. Regards, -Corey

-- Corey A Harper Metadata Services Librarian New York University Libraries 20 Cooper Square, 3rd Floor New York, NY 10003-7112 212.998.2479 corey.har...@nyu.edu
Re: [CODE4LIB] transforming marc to rdf
I've been following this conversation as a non-coder. I'm really interested in getting a better understanding of linked data and how to use existing metadata for proof of concept linked data outputs. So, I totally think Eric's approaches are valuable and would be something I would use. I also understand there are many ways to do something better and more in the flow. So, just encouraging you all to keep posting thoughts in both directions! Best, Lisa - Elizabeth Lisa McAulay Librarian for Digital Collection Development UCLA Digital Library Program http://digital.library.ucla.edu/ email: emcaulay [at] library.ucla.edu
Re: [CODE4LIB] transforming marc to rdf
On Thu, Dec 5, 2013 at 11:57 AM, Eric Lease Morgan emor...@nd.edu wrote: Publishing via a content management system would make more sense, if: 1. the archivist uses the specific content management system 2. the content management system supported the functionality “There is more than one way to skin a cat.” There are advantages and disadvantages to every software solution.

I recognize that not everyone uses a collection management system and instead may author description using EAD or something else directly, but I think we really need to acknowledge the affordances of that kind of software here. I can tell you for certain there are certain aspects of the ArchivesSpace data model that are not serializable in any good way - or at all - using EAD or MARC.

Per Corey's message: I have no objection in principle to using XSLT to provide examples of ways to do this transformation (I know lots of people have piles of existing EAD) as long as the resulting data is acknowledged to be less than ideal. EAD is also not a data model; it's a document model for a finding aid. EAD3 will improve this somewhat, but it's still not a representation of a conceptual model of archival entities. My concern about using something like XSLT *specifically* to transform archival description stored in MARC is that the existing stylesheets assume that the MARC description is bibliographic description. Archival description is not bibliographic description. Mark
Re: [CODE4LIB] transforming marc to rdf
On Thu, Dec 5, 2013 at 11:57 AM, Eric Lease Morgan emor...@nd.edu wrote: “There is more than one way to skin a cat.” There are advantages and disadvantages to every software solution.

I think what Mark and I are trying to say is that the first step to this solution is not applying software to existing data, but figuring out the problem you're actually trying to solve. Any linked data future cannot be as simple as a technologist giving some magic tool to archivists and librarians. You still haven't really answered my question about what you're hoping to achieve and who stands to benefit from it. I don't see the benefit of assigning a bunch of arbitrary identifiers, properties, and values to a description of a collection of archival materials (especially since you're talking about doing this in XSLT, so your archival collections can't even really be related to /each other/, much less anything else). Who is going to use this data? What are they supposed to do with it? What will libraries and archives get from it?

I am certainly not above academic exercises (or without my own), but I can see absolutely *no* beneficial archival linked data created simply by pointing an XSLT at a bunch of EAD and MARCXML, and I certainly can't see any without a clear vision of the model that said XSLT is supposed to generate. The key part here is the data model, and taking a 'software solution'-first approach does nothing to address that. -Ross.
Re: [CODE4LIB] transforming marc to rdf
* BIBFRAME Tools [6] - sports nice ontologies, but the online tools won’t scale for large operations --

The code running the transformation at [6] is available here: https://github.com/lcnetdev/marc2bibframe We've run several million records through it at one time. As with everything, the data needs to be properly prepared, and we have a script that processes those millions in smaller (but still sizeable) batches. Yours, Kevin

On 12/04/2013 09:59 PM, Eric Lease Morgan wrote: I have to eat some crow, and I hope somebody here can give me some advice for transforming MARC to RDF. I am in the midst of writing a book describing the benefits of linked data for archives. Archival metadata usually comes in two flavors: EAD and MARC. I found a nifty XSL stylesheet from the Archives Hub (that’s in the United Kingdom) transforming EAD to RDF/XML. [1] With a bit of customization I think it could be used quite well by just about anybody with EAD files. I have retained a resulting RDF/XML file online. [2] Converting MARC to RDF has been more problematic. There are various tools enabling me to convert my original MARC into MARCXML and/or MODS. After that I can reportedly use a few tools to convert to RDF: * MARC21slim2RDFDC.xsl [3] - functions, but even for my tastes the resulting RDF is too vanilla. [4] * modsrdf.xsl [5] - optimal, but when I use my transformation engine (Saxon), I do not get XML but rather plain text * BIBFRAME Tools [6] - sports nice ontologies, but the online tools won’t scale for large operations In short, I have discovered nothing that is “easy to use”. Can you provide me with any other links allowing me to convert MARC to serialized RDF?
[1] ead2rdf.xsl - http://data.archiveshub.ac.uk/xslt/ead2rdf.xsl [2] transformed EAD file - http://infomotions.com/tmp/una-ano.rdf [3] MARC21slim2RDFDC.xsl - http://www.loc.gov/standards/marcxml/xslt/MARC21slim2RDFDC.xsl [4] vanilla RDF - http://infomotions.com/tmp/pamphlets.rdf [5] modsrdf.xsl - http://www.loc.gov/standards/mods/modsrdf/xsl-files/modsrdf.xsl [6] BIBFRAME Tools - http://bibframe.org/tools/transform/start — Eric Lease Morgan
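Kevin's point about preparing the data in smaller batches can be sketched with yaz-marcdump's split options (-s for the output prefix, -C for records per chunk, as I understand them; verify against your installed version). Mock files stand in for the chunks so the loop itself is runnable:

```shell
# The real split would be something like (not executed here):
#   yaz-marcdump -s batch -C 10000 all-records.mrc
# which writes batch0000001, batch0000002, ... Each chunk is then
# transformed in turn. Mock files stand in for the chunks below.
touch batch0000001 batch0000002
count=0
for f in batch*; do
    count=$((count + 1))
    echo "would transform $f"   # e.g. run the XSLT or marc2bibframe here
done
echo "$count chunks"
```

Splitting first keeps each transformation run small enough to fit in memory, which is exactly the trade-off Ross describes.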
Re: [CODE4LIB] transforming marc to rdf
Anything that will remodel MARC to (decent) RDF is going to be: - Non-trivial to install - Non-trivial to use - Slow - Require massive amounts of memory/disk space Choose any two. -- I'll second this. Frankly, I don't see how you can generate RDF that anybody would want to use from XSLT: where would your URIs come from? What, exactly, are you modeling? --

Our experience getting to good, URI-rich RDF has been basically a two-step process. First there is the raw conversion, which certainly results in verbose, blank-node-rich RDF, but we follow that pass with a second one during which blank nodes are replaced with URIs. This has most certainly been the case with BIBFRAME because X number of MARC records may represent varying manifestations of a single work. We don't want X number of instances (manifestations, basically) referencing X number of works in the end, but X number of instances referencing 1 work (all other things being equal). We consolidate - for the lack of a better word - X number of works created in the first pass into 1 work (identified by an HTTP URI) and then we make sure X number of instances point to that one work, removing all the duplicate blank-node-identified resources created during the first pass. Granted, this consolidation scenario is not scalable without a fairly robust backend solution, but the process at bibframe.org (the code on github) nevertheless does the type of consolidation described above in memory with small MARC collections. Yours, Kevin

On 12/05/2013 08:55 AM, Ross Singer wrote: Eric, I'm having a hard time figuring out exactly what you're hoping to get. Going from MARC to RDF was my great white whale for years while Talis' main business interests involved both of those (although not archival collections). Anything that will remodel MARC to (decent) RDF is going to be: - Non-trivial to install - Non-trivial to use - Slow - Require massive amounts of memory/disk space Choose any two.
Frankly, I don't see how you can generate RDF that anybody would want to use from XSLT: where would your URIs come from? What, exactly, are you modeling? I guess, to me, it would be a lot more helpful for you to take an archival MARC record and, by hand, build an RDF graph from it, then figure out your mappings. I just don't see any way to make it easy to use, at least not until you have an agreed-upon model to map to. -Ross.

On Thu, Dec 5, 2013 at 3:07 AM, Christian Pietsch chr.pietsch+web4...@googlemail.com wrote: Hi Eric, you seem to have missed the Catmandu tutorial at SWIB13. [...]
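Kevin's two-pass consolidation can be sketched in a few lines of Python. This is an illustration of the idea only, with made-up data and a hypothetical URI-minting function, not the actual bibframe.org code: the first pass has already produced instances pointing at blank-node works, and the second pass groups equivalent works, mints one HTTP URI per group, and repoints the instances.

```python
# Sketch of the two-pass consolidation described above (hypothetical data,
# not the bibframe.org implementation).

def consolidate(instances, works, mint_uri):
    """instances: {instance_id: work_blank_node}
       works: {work_blank_node: key}, where equal keys mean 'the same work'
       mint_uri: callable turning a key into an HTTP URI"""
    # Group blank-node works that describe the same work under one URI.
    uri_for_key = {}
    for bnode, key in works.items():
        uri_for_key.setdefault(key, mint_uri(key))
    # Repoint every instance at the single consolidated work URI, dropping
    # the duplicate blank-node-identified works from the first pass.
    return {inst: uri_for_key[works[bnode]] for inst, bnode in instances.items()}

# Toy example: three MARC-derived instances, two of which manifest one work.
works = {"_:b1": ("Moby Dick", "Melville"), "_:b2": ("Moby Dick", "Melville"),
         "_:b3": ("Typee", "Melville")}
instances = {"inst1": "_:b1", "inst2": "_:b2", "inst3": "_:b3"}
mint = lambda key: "http://example.org/works/" + str(abs(hash(key)))

linked = consolidate(instances, works, mint)
assert linked["inst1"] == linked["inst2"] != linked["inst3"]
```

In a real pipeline the grouping key would come from matching rules (e.g. normalized title/author pairs) and the minted URIs would be stable identifiers, not hashes; the hard part Kevin alludes to is doing this at scale rather than in memory.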
[CODE4LIB] coder needed: JavaScript / Google Maps API V.3 Programming project
know any folks w/ Google Maps API skills looking for an odd job? pls. fwd, thanks! =Derek

*JavaScript / Google Maps API V.3 Programming Job : Wolfsonian – FIU (www.wolfsonian.org)*

We are putting together a site to provide public access to the high-resolution digital image of a painting called the Menneske Pyramide [Human Pyramid], by Harald Rudyard Engman, created in Copenhagen, Denmark, in 1941. http://www.wolfsonian.org/explore/collections/menneske-pyramide-human-pyramid

This painting (part of our permanent collection) is full of socio-political messages relating to Denmark's unfortunate involvement in WWII. Many of the figures depicted in the painting are caricatures or likenesses of figures from Danish history and mythology, as well as then-current political and cultural figures from the early part of the 20th century. In the past, some curators and researchers have identified some of these figures, and we are hoping to utilize the knowledge of the crowd to fill in more of the blanks. To this end we have created a prototype site that uses the custom map feature of the Google Maps API v.3, which gives most web browsers quick and smooth zooming and panning of this near-gigapixel image. Rudimentary coding has placed some permanent map-pin markers with pop-up info boxes containing text annotations for the figures we have already identified.

What we need from you: we want the public version to allow people to contribute annotations by dragging a new marker-pin onto the image and then entering text in that marker's info box. This annotation would be submitted to a moderation queue and, if approved, would show up on the public version for all to see. How this is managed is up to you as long as it's easily sustainable by our staff; Google Docs or Fusion Tables are fine, and we are open to suggestions.
Please contact Derek Merleaux [de...@thewolf.fiu.edu]. We are seeking direct contacts from coders only; no solicitations on behalf of third parties, please. Also, you must be willing and able to become an approved vendor to Florida International University [details: http://finance.fiu.edu/purchasing/2vendor_forms.html] *-*
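The posting leaves the moderation backend open (Google Docs, Fusion Tables, or anything else). As a rough sketch of the submit/approve flow described above, with all names hypothetical and plain in-memory lists standing in for real storage:

```python
# Minimal sketch of the annotation moderation queue the posting describes.
# All names are hypothetical; the actual backend (Google Docs, Fusion
# Tables, a database) is left open, so lists stand in for storage here.

pending, approved = [], []

def submit(x, y, text):
    """A visitor drags a pin to image coordinates (x, y) and enters text."""
    annotation = {"x": x, "y": y, "text": text}
    pending.append(annotation)
    return annotation

def moderate(annotation, ok):
    """Staff approve or reject a pending annotation."""
    pending.remove(annotation)
    if ok:
        approved.append(annotation)  # now visible on the public map

a = submit(120.5, 88.0, "Likeness of a Danish politician?")
moderate(a, ok=True)
assert a in approved and a not in pending
```

On the front end, each approved record would become a Maps API marker with an info window; only the `approved` set is ever sent to the public page, which is what keeps unreviewed submissions out of view.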
[CODE4LIB] Job: Front-End Web Designer / UI Specialist at East Carolina University
Front-End Web Designer / UI Specialist
East Carolina University
Greenville, NC

The East Carolina University Libraries are looking for a Front-End Web Designer / UI Specialist. The position's responsibilities and preferred qualifications are the following:

This individual works collaboratively with library departments to assess web development needs and user requirements and to create engaging design concepts/visual assets. These concepts are then translated into working prototypes for a variety of projects involving the website, custom applications, and discovery interfaces. This individual then works with fellow team members to see prototypes through the development lifecycle. In addition to new development, this person works to improve usability and the overall user experience of the libraries' existing web presence. This person works with marketing and other team members to harmonize existing projects under a unified brand, enhance the usability of existing projects, and improve the user experience as it relates to discovery of library services and resources. The individual in this position performs usability analyses to ensure optimal accessibility to library tools and resources.

Preferred Experience: Bachelor's degree in a user experience discipline such as human-computer interaction design, interaction design, or a related field. 2-3 years of experience in interface and visual design (as demonstrated by a portfolio) OR 2-3 years of web design and development experience, with a solid understanding of information architecture and UI design. Experience with interaction design, visual design, image formats and properties, web design, web fonts, and mobile design; demonstrated experience with the relevant interaction and visual design tools (Adobe Creative Suite or equivalent). Demonstrated experience with HTML, CSS, JavaScript, and responsive design; an understanding of cross-browser issues, user interface design best practices, and web and accessibility standards.
Please encourage interested and talented candidates to apply! Here is the permalink: http://ecu.peopleadmin.com/applicants/Central?quickFind=73163 Brought to you by code4lib jobs: http://jobs.code4lib.org/job/11058/
Re: [CODE4LIB] WASSAIL / Assessment Tools
Here's what one of our instructional librarians said in response to this: At the University of Arizona Libraries, we piloted WASSAIL in 2012. There were a number of usability issues: the user interface was not intuitive; you couldn't preview created question items without creating a test (I believe this has now been fixed); you couldn't easily import questions into D2L (Desire2Learn); and upgrades to the database were made infrequently, since only one technology staff member was assigned to the product. My advice would be to contact the tech staff for questions about the tech aspects of WASSAIL. HTH, Mike

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Michael Schofield
Sent: Wednesday, November 27, 2013 11:04 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] WASSAIL / Assessment Tools

Hey see-for-ehl, It's still a year or two out, but re-accreditation awaits us at the end of the harrowing tunnel of library-work. One of my coworkers asked me to explore tools for assessment, linking me in example to WASSAIL // Augustana Information Literacy from the University of Alberta (http://www.library.ualberta.ca/augustana/infolit/wassail/). I can't say that I'm particularly qualified to judge assessment tools, and I was hoping you might have insight about WASSAIL or anything else. My only real concern is that I don't want to adopt or force a tool that will only be a temporary stop-gap or will somehow be silo'd (siloed?) from the rest of our applications and content, so I'd hope the tool would be versatile and easy to adapt so that we could truly integrate it. I can judge this part. I don't know jack about assessment / assessment tools!

Michael Schofield (@nova.edu) | Web Services Librarian | (954) 262-4536 Alvin Sherman Library, Research, and Information Technology Center // oh, and I write about the #libweb at www.ns4lib.com
[CODE4LIB] Online Course: Responsive Web Design for Libraries
Please feel free to share on other appropriate listservs, blogs, and with colleagues.

Responsive Web Design for Libraries: https://infopeople.org/civicrm/event/info?reset=1&id=281
An Infopeople online course, January 28 to February 24, 2014

By next year there will be more active mobile devices than people on the planet. How can you ensure that your library's online services work as well on smartphones and tablets as they do on desktop computers? What about devices that haven't been dreamed of yet? Instead of reacting to each new device, you can build websites that adapt to any device. Join Matt Reidsma, author of a new book on responsive design for libraries, to learn:
- The basics of responsive web design (RWD)
- How to compare RWD against other solutions to the "mobile problem"
- How to implement best practices for website design in an increasingly mobile world, even if you don't use RWD

Instructor: Matthew Reidsma
Fee: $75 for those in the California library community, $150 for all others.
For a complete course description and to register, go to https://infopeople.org/civicrm/event/info?reset=1&id=281.

--
Matthew Reidsma
Web Services Librarian, Grand Valley State University
matthewreidsma.com :: @mreidsma
[CODE4LIB] Fwd: PASIG Webinar - Digital Forensics and BitCurator
Possibly of interest: a digital forensics webinar about the BitCurator (http://wiki.bitcurator.net) software.

-- Forwarded message --
From: ASIST Continuing Education educat...@asis.org
Date: Fri, Dec 6, 2013 at 1:51 AM
Subject: PASIG Webinar - Digital Forensics and BitCurator
To: jschnei...@pobox.com

*PASIG Webinar - Digital Forensics and BitCurator*
*Join us for a Webinar on December 12*
*Free for ASIST Members, $20 for non-members*
https://www3.gotomeeting.com/register/434136862
*Space is limited.* Reserve your Webinar seat now at: http://www.asis.org/Conferences/webinars/Webinar-PASIG-12-12-2013-register.html

The BitCurator Project, a collaborative effort led by the School of Information and Library Science at the University of North Carolina at Chapel Hill and the Maryland Institute for Technology in the Humanities at the University of Maryland, builds on previous work by addressing two fundamental needs and opportunities for collecting institutions: (1) integrating digital forensics tools and methods into the workflows and collection management environments of libraries, archives, and museums, and (2) supporting properly mediated public access to forensically acquired data.

The project is developing and disseminating a suite of open source tools. These tools are currently being developed and tested in a Linux environment; the software on which they depend can readily be compiled for Windows environments (and in most cases is currently distributed as both source code and Windows binaries). We intend the majority of the development for BitCurator to support cross-platform use of the software. We are freely disseminating the software under an open source (GPL, Version 3) license. BitCurator provides users with two primary paths to integrate digital forensics tools and techniques into archival and library workflows.
This webinar will introduce the BitCurator environment and briefly highlight support for mounting media as read-only, creating disk images, using Nautilus scripts to perform batch activities, generating Digital Forensics XML (DFXML), generating customized reports, and identifying sensitive data. Participants who are interested in trying out the software in advance can download and install the BitCurator environment by following the instructions at http://wiki.bitcurator.net.

*Title:* PASIG Webinar - Digital Forensics and BitCurator
*Date:* Thursday, December 12, 2013
*Time:* 11:30 AM - 12:30 PM EST

After registering you will receive a confirmation e-mail containing information about joining the Webinar.

*System Requirements*
PC-based attendees: Windows® 8, 7, Vista, XP or 2003 Server
Mac®-based attendees: Mac OS® X 10.6 or newer
Mobile attendees: iPhone®, iPad®, Android™ phone or Android tablet
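To give a feel for what "generation of Digital Forensics XML (DFXML)" involves, here is a toy Python sketch that emits a DFXML-style fileobject element per file, carrying filename, size, and an MD5 digest. It only illustrates the kind of per-file metadata DFXML records; it is not BitCurator's tooling, and real DFXML documents carry much more (tool provenance, partition layout, proper schema namespaces).

```python
import hashlib
import os
import xml.etree.ElementTree as ET

def dfxml_report(paths):
    """Emit a minimal DFXML-style report: one <fileobject> per file with
    its filename, size, and MD5 hash. A toy illustration only, not output
    of the real BitCurator/DFXML toolchain."""
    root = ET.Element("dfxml", version="1.0")
    for path in paths:
        fo = ET.SubElement(root, "fileobject")
        ET.SubElement(fo, "filename").text = path
        ET.SubElement(fo, "filesize").text = str(os.path.getsize(path))
        with open(path, "rb") as f:
            digest = hashlib.md5(f.read()).hexdigest()
        ET.SubElement(fo, "hashdigest", type="md5").text = digest
    return ET.tostring(root, encoding="unicode")

# Example: report on a small file written for the demonstration.
with open("demo.txt", "w") as f:
    f.write("hello")
print(dfxml_report(["demo.txt"]))
```

In real digital-forensics workflows, reports like this are generated from a read-only disk image rather than the live filesystem, so the hashes document the media exactly as acquired.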
Re: [CODE4LIB] REMINDER: Voting for Code4Lib 2014 Prepared Talks ends December 6th
I've activated all of the new code4lib.org accounts I could find over the last couple weeks. If you registered at code4lib.org but the account has not been activated yet, let me know what your username is. Or if you have any other account login issues.

Ryan Wick ryanw...@gmail.com

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kevin Beswick
Sent: Tuesday, December 03, 2013 7:14 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] REMINDER: Voting for Code4Lib 2014 Prepared Talks ends December 6th

The program committee would like to remind everyone to submit their votes for prepared talk proposals for Code4Lib 2014 in Raleigh, NC. Voting closes this Friday, December 6th by 11:59:59pm PDT. To vote for prepared talks, visit: http://vote.code4lib.org/election/28 For more details about voting, see the original message below. Thanks! Code4Lib Program Committee

-- Forwarded message --
From: Trevor Thornton trevorthorn...@nypl.org
Date: Mon, Nov 18, 2013 at 10:12 AM
Subject: [CODE4LIB] Voting for Code4Lib 2014 Prepared Talks begins today!
To: CODE4LIB@listserv.nd.edu

The Code4Lib 2014 Program Committee is happy to announce that voting is now open for prepared talks. To vote, visit http://vote.code4lib.org/election/28, review the proposals, and assign points to those presentations you would like to see on the program this year. You will need to log in with your code4lib.org username and password in order to vote. If you have any issues with your account, please contact Ryan Wick at ryanw...@gmail.com. *Voting will end on Friday, December 6, 2013 at 11:59:59 PM PDT.* The 10 proposals with the most votes will be guaranteed a slot at the conference. Additional presentations will be selected by the Program Committee in an effort to ensure diversity in program content. Community votes will still weigh heavily in these decisions. For more information about Code4Lib 2014, visit http://code4lib.org/conference/2014/.
-- Trevor Thornton Senior Applications Developer, NYPL Labs The New York Public Library phone: 212-621-0287 email: trevorthorn...@nypl.org