Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Christian Pietsch
Hi Eric,

you seem to have missed the Catmandu tutorial at SWIB13. Luckily there
is a basic tutorial and a demo online: http://librecat.org/

The demo happens to be about transforming MARC to RDF using the
Catmandu Perl framework. It gives you full flexibility by separating
the importer from the exporter and providing a domain specific
language for “fixing” the data in between. Catmandu also has easy
to use wrappers for popular search engines and databases (both SQL and
NoSQL), making it a complete ETL (extract, transform, load) toolkit.
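A minimal sketch of that pipeline in Perl might look like the following (module and
option names from memory; the fix rules and the namespace are illustrative only, not
a canonical mapping):

  use Catmandu;

  my $importer = Catmandu->importer('MARC', file => 'records.mrc');
  my $fixer    = Catmandu->fixer(
      'marc_map(001, _id)',                        # the record identifier becomes the subject
      'prepend(_id, "http://example.org/item/")',  # placeholder namespace
      'marc_map(245a, dc_title)',                  # aREF-style predicate for dc:title
      'remove_field(record)',
  );
  my $exporter = Catmandu->exporter('RDF', file => 'records.ttl', type => 'ttl');

  $exporter->add_many($fixer->fix($importer));
  $exporter->commit;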

Disclosure: I am a Catmandu contributor. It's free and open source
software.

Cheers,
Christian


On Wed, Dec 04, 2013 at 09:59:46PM -0500, Eric Lease Morgan wrote:
 Converting MARC to RDF has been more problematic. There are various
 tools enabling me to convert my original MARC into MARCXML and/or
 MODS. After that I can reportedly use a few tools to convert to RDF:
 
   * MARC21slim2RDFDC.xsl [3] - functions, but even for
 my tastes the resulting RDF is too vanilla. [4]
 
   * modsrdf.xsl [5] - optimal, but when I use my
 transformation engine (Saxon), I do not get XML
 but rather plain text
 
   * BIBFRAME Tools [6] - sports nice ontologies, but
 the online tools won’t scale for large operations

-- 
  Christian Pietsch · http://www.ub.uni-bielefeld.de/~cpietsch/
  LibTec · Library Technology and Knowledge Management
  Bielefeld University Library, Bielefeld, Germany


Re: [CODE4LIB] transforming marc to rdf [comet]

2013-12-05 Thread Eric Lease Morgan
On Dec 4, 2013, at 10:29 PM, Corey A Harper corey.har...@nyu.edu wrote:

 Have you had a look at Ed Chamberlain's work on COMET:
 https://github.com/edchamberlain/COMET
 
 It's been a while since I've run this, but if I remember correctly, it was
 fairly easy-to-use.


Thank you for the pointer. I downloaded the COMET “suite”, and got good output, 
but only after I enhanced/tweaked the source code to require the Perl Encode 
module:

  ./marc2rdf_batch.pl pamphlets.marc
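In essence that meant adding something like the following near the top of the script
(a sketch of the shape of the tweak; the exact lines may differ):

  use Encode;                           # added near the top of marc2rdf_batch.pl
  binmode STDOUT, ':encoding(UTF-8)';   # plus, if needed, UTF-8 on the output handle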

The result was a huge set of triples saved as RDF/Turtle. I then used a Java 
archive (RDF2RDF [1]) to painlessly convert the Turtle to RDF/XML. The process 
worked. It was “easy” for me, sort of, but it employs quite a number of 
sophisticated underlying technologies. I could integrate everything into a 
whole, but… on to explore other options.

[1] RDF2RDF - http://www.l3s.de/~minack/rdf2rdf/
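As an aside, anyone who wants to stay in Perl rather than reach for a Java archive
could sketch the same Turtle-to-RDF/XML conversion with RDF::Trine (the base URI is
a placeholder):

  use RDF::Trine;

  my $model = RDF::Trine::Model->temporary_model;
  RDF::Trine::Parser->new('turtle')
      ->parse_file_into_model('http://example.org/', 'pamphlets.ttl', $model);

  open my $out, '>', 'pamphlets.rdf' or die $!;
  print {$out} RDF::Trine::Serializer->new('rdfxml')->serialize_model_to_string($model);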

—
Sleepless In South Bend


Re: [CODE4LIB] transforming marc to rdf [mods_rdfizer]

2013-12-05 Thread Eric Lease Morgan
On Dec 4, 2013, at 10:29 PM, Corey A Harper corey.har...@nyu.edu wrote:

 Also, though much older, I seem to remember the Simile MARC RDFizer being
 a pretty straightforward one to run:
 http://simile.mit.edu/wiki/MARC/MODS_RDFizer
 
 MODS aficionados will point to some problems with some of it's choices for
 representing that data, but still a good starting point (IMO).


Again, thanks for the pointer. I downloaded MODS_RDFizer and got it to run, but 
it was a good thing that I already had mvn installed. The output was indeed an 
RDF/XML file, and I concur, the implemented ontology is “interesting”. The 
distribution includes a possibly cool stylesheet — mods2rdf.xslt. Maybe I can 
use this. Hmm…  —Still Sleepless


Re: [CODE4LIB] transforming marc to rdf [mods_rdfizer]

2013-12-05 Thread Eric Lease Morgan
On Dec 5, 2013, at 6:54 AM, Eric Lease Morgan emor...@nd.edu wrote:

 http://simile.mit.edu/wiki/MARC/MODS_RDFizer
 
 ...The distribution includes a possibly cool stylesheet — mods2rdf.xslt.


Ah ha! The MODS_RDFizer’s mods2rdf.xslt file functioned very well against one 
of my MODS files:

  $ xsltproc mods2rdf.xslt pamphlets.mods > pamphlets.rdf

Mods2rdf.xslt could very easily be configured at the beginning of the file to 
suit the needs of a local “cultural heritage institution”. I like the use of 
XSL to create a serialized RDF as opposed to the use of an application because 
less infrastructure is needed to make things happen. 

—
Too Much Coffee?


Re: [CODE4LIB] transforming marc to rdf [catmandu]

2013-12-05 Thread Eric Lease Morgan
On Dec 5, 2013, at 3:07 AM, Christian Pietsch 
chr.pietsch+web4...@googlemail.com wrote:

 you seem to have missed the Catmandu tutorial at SWIB13. Luckily there
 is a basic tutorial and a demo online: http://librecat.org/


I did attend SWIB13, and I really wanted to go to the Catmandu workshop, but 
since I’m a Perl “aficionado” I figured I could play with it later on my own. 
Instead I attended the workshop on provenance. (Travelogue is pending.)

In any event, playing with the Catmandu demo was insightful. [1] I see and 
understand the workflow: import data, fix it, store it, fix it, export it. I 
see how it is designed to use many import and export formats. The key to the 
software seems to be two-fold: 1) the ability to read and write Perl programs, 
and 2) understanding Catmandu’s “fix” language. There are great possibilities 
here for us Perl folks. Thank you for re-bringing it to my attention.

[1] demo - http://demo.librecat.org

— 
Eric Lease Morgan


Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Ross Singer
Eric, I'm having a hard time figuring out exactly what you're hoping to get.

Going from MARC to RDF was my great white whale for years while Talis' main
business interests involved both of those (although not archival
collections).  Anything that will remodel MARC to (decent) RDF is going to be:

   - Non-trivial to install
   - Non-trivial to use
   - Slow
   - Require massive amounts of memory/disk space

Choose any two.

Frankly, I don't see how you can generate RDF that anybody would want to
use from XSLT: where would your URIs come from?  What, exactly, are you
modeling?

I guess, to me, it would be a lot more helpful for you to take an archival
MARC record, and, by hand, build an RDF graph from it, then figure out your
mappings.  I just don't see any way to make it easy-to-use, at least, not
until you have an agreed upon model to map to.

-Ross.


On Thu, Dec 5, 2013 at 3:07 AM, Christian Pietsch 
chr.pietsch+web4...@googlemail.com wrote:

 Hi Eric,

 you seem to have missed the Catmandu tutorial at SWIB13. Luckily there
 is a basic tutorial and a demo online: http://librecat.org/

 The demo happens to be about transforming MARC to RDF using the
 Catmandu Perl framework. It gives you full flexibility by separating
 the importer from the exporter and providing a domain specific
 language for “fixing” the data in between. Catmandu also has easy
 to use wrappers for popular search engines and databases (both SQL and
 NoSQL), making it a complete ETL (extract, transform, load) toolkit.

 Disclosure: I am a Catmandu contributor. It's free and open source
 software.

 Cheers,
 Christian


 On Wed, Dec 04, 2013 at 09:59:46PM -0500, Eric Lease Morgan wrote:
  Converting MARC to RDF has been more problematic. There are various
  tools enabling me to convert my original MARC into MARCXML and/or
  MODS. After that I can reportedly use a few tools to convert to RDF:
 
* MARC21slim2RDFDC.xsl [3] - functions, but even for
  my tastes the resulting RDF is too vanilla. [4]
 
* modsrdf.xsl [5] - optimal, but when I use my
  transformation engine (Saxon), I do not get XML
  but rather plain text
 
* BIBFRAME Tools [6] - sports nice ontologies, but
  the online tools won’t scale for large operations

 --
   Christian Pietsch · http://www.ub.uni-bielefeld.de/~cpietsch/
   LibTec · Library Technology and Knowledge Management
   Bielefeld University Library, Bielefeld, Germany



Re: [CODE4LIB] Discovery layer for Primo

2013-12-05 Thread Patrick Hartsfield
Everything I've heard from Ex Libris is that Alma is discovery layer agnostic, 
though they understandably want you to use Primo since it's their product. 
Perhaps the differentiation is that they won't host third-party discovery 
layers at this time? If you wanted to use Blacklight/VuFind/etc. it should work 
but would have to be self-hosted.


Re: [CODE4LIB] transforming marc to rdf [to batch or not to batch]

2013-12-05 Thread Eric Lease Morgan
When exposing sets of MARC records as linked data, do you think it is better to 
expose them in batch (collection) files or as individual RDF serializations? To 
bastardize the Bard — “To batch or not to batch? That is the question.”

Suppose I am a medium-sized academic research library. Suppose my collection 
comprises approximately 3.5 million bibliographic records. Suppose I want to 
expose those records via linked data. Suppose further that this will be done by 
“simply” making RDF serialization files (XML, Turtle, etc.) accessible via an 
HTTP filesystem. No scripts. No programs. No triple stores. Just files on an 
HTTP file system coupled with content negotiation. Given these assumptions, 
would you:

  1. create batches of MARC records, convert them to MARCXML
 and then to RDF, and save these files to disc, or

  2. parse the batches of MARC record sets into individual
 records, convert them into MARCXML and then RDF, and
 save these files to disc

Option #1 would require heavy lifting against large files, but the number of 
resulting files to save to disc would be relatively few — reasonably managed in 
a single directory on disc. On the other hand, individual URIs pointing to 
individual serializations would not be accessible. They would only be 
accessible by retrieving the collection file in which they reside. Moreover, a 
mapping of individual URIs to collection files would need to be maintained. 

Option #2 would be easier on the computing resources because processing little 
files is generally easier than processing bigger ones. On the other hand, the 
number of files generated by this option cannot easily be managed without the 
use of a sophisticated directory structure. (It is not feasible to put 3.5 
million files in a single directory.) But I would still need to create a 
mapping from URI to directory.
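For example, one common scheme is to hash the record identifier and use the leading 
characters of the digest as subdirectories. A sketch in Perl (the layout is 
illustrative only):

  use Digest::MD5 qw(md5_hex);

  # spreads millions of files across 65,536 (256 x 256) two-level subdirectories
  sub path_for {
      my $id   = shift;                 # e.g. the value of a record's 001 field
      my $hash = md5_hex($id);
      return join '/', substr($hash, 0, 2), substr($hash, 2, 2), "$id.rdf";
  }

With a deterministic scheme like this, the path can be computed from the URI on the 
fly, which makes the mapping easier to maintain.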

In either case, I would probably create a bunch of site map files denoting the 
locations of my serializations — YAP (Yet Another Mapping).

I’m leaning towards Option #2 because individual URIs could be resolved more 
easily with “simple” content negotiation.
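For reference, the “simple” content negotiation I have in mind could be as little as 
an Apache configuration along these lines (a sketch; it assumes mod_negotiation is 
enabled and that each record is stored as both 12345.html and 12345.rdf in the same 
directory):

  Options +MultiViews
  AddType text/html .html
  AddType application/rdf+xml .rdf
  # a request for /catalog/12345 with "Accept: application/rdf+xml" then resolves to 12345.rdf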

(Given my particular use case — archival MARC records — I don’t think I’d 
really have more than a few thousand items, but I’m asking the question on a 
large scale anyway.)

—
Eric Morgan


Re: [CODE4LIB] book cover api

2013-12-05 Thread Kaile Zhu
On second thought, IIIF won't work for my situation either, though it offers 
much more flexible manipulation on an individual basis.

My situation is: I have a loop that lists many books, and I want a book cover 
image for each one.

Kelly

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Keith 
Jenkins
Sent: December 4, 2013 13:50
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] book cover api

So, any bets on which book cover image provider will be the first to implement 
IIIF?

http://www-sul.stanford.edu/iiif/image-api/1.1/

Keith


On Wed, Dec 4, 2013 at 2:41 PM, Karen Coyle li...@kcoyle.net wrote:
 Open Library book covers come in S, M and L -

 https://openlibrary.org/dev/docs/api/covers

 Of course, if what you want isn't exactly one of those...

 kc


 On 12/4/13 9:34 AM, Kaile Zhu wrote:

 A while ago, we had a discussion about book cover APIs.  I tried some 
 of those mentioned and found they are working to some degree, but 
 none of them would offer the size I want.  The flexibility of the size is 
 just not there.
 The size I am looking for is like this:
 http://img1.imagesbn.com/p/9780316227940_p0_v2_s114x166.JPG

 Has anybody found a way of implementing a book cover API to your 
 specifications successfully, and is willing to share it with me?  
 Off-line if you want.  Much appreciation.  Thanks.

 Kelly Zhu
 405-974-5957
 kz...@uco.edu



 --
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 m: 1-510-435-8234
 skype: kcoylenet






Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Eric Lease Morgan
On Dec 5, 2013, at 8:55 AM, Ross Singer rossfsin...@gmail.com wrote:

 Eric, I'm having a hard time figuring out exactly what you're hoping to get.
 
 Going from MARC to RDF was my great white whale for years while Talis' main
 business interests involved both of those (although not archival
 collections).  Anything that will remodel MARC to (decent) RDF is going to be:
 
   - Non-trivial to install
   - Non-trivial to use
   - Slow
   - Require massive amounts of memory/disk space
 
 Choose any two.
 
 Frankly, I don't see how you can generate RDF that anybody would want to
 use from XSLT: where would your URIs come from?  What, exactly, are you
 modeling?
 
 I guess, to me, it would be a lot more helpful for you to take an archival
 MARC record, and, by hand, build an RDF graph from it, then figure out your
 mappings.  I just don't see any way to make it easy-to-use, at least, not
 until you have an agreed upon model to map to.


Ross, good questions. I’m hoping to articulate and implement a simple and 
functional method for exposing EAD and MARC metadata as linked data. “Simple 
and functional” are the operative words; I’m not necessarily looking for 
“fast”, “best”, or “perfect”. I am trying to articulate something that requires 
the least amount of infrastructure and technical expertise.

Reasonable RDF through XSLT? Good point. I like the use of XSLT because it does 
not require very much technical infrastructure — just ubiquitous XSLT 
processors like Saxon or xsltproc. I have identified two or three stylesheets 
transforming MARCXML/MODS into RDF/XML.

  1. The first comes from the Library of Congress and uses Dublin
 Core as its ontology, but the resulting RDF has no URIs and
 the Dublin Core is not good enough, even for my tastes. [1]

  2. The second also comes from the Library of Congress, and it
 uses a richer, more standard ontology, but I can’t get it to
 work. All I get as output is a plain text file. I must be
 doing something wrong. [2]

  3. I found the third stylesheet buried in the MARC/MODS RDFizer.
 The sheet uses XSLT 1.0 which is good for my xsltproc-like
 tools. I get output, which is better than Sheet #2. The
 ontology is a bit MIT-specific, but it is one heck of a lot
 richer than Sheet #1. Moreover, the RDF includes URIs. [3, 4]

In none of these cases will the ontology be best or perfect, but for right now 
I don’t care. The ontology is good enough. Heck, the ontologies don’t even come 
close to the ontology I get when transforming my EAD to RDF using the Archives 
Hub stylesheet. [5] I just want to expose the content as linked data. Somebody 
else — the community — can come behind to improve the stylesheets and their 
ontologies. 

Where will I get the URIs from? I will get them by combining some sort of 
unique code (like an OCLC symbol) or namespace with the value of the MARC 
records' 001 fields.

Here is an elaboration of my original recipe for making MARC metadata 
accessible via linked data:

  1. obtain a set of MARC records
  2. parse out a record from the set
  3. convert it to MARCXML
  4. transform MARCXML into HTML
  5. transform MARCXML into RDF (probably through MODS first)
  6. save HTML and RDF to disc
  7. update a mapping file / data structure denoting where things are located
  8. go to Step #2 for each record in the set
  9. use the mapping to create a set of site map files
 10. use the mapping to support HTTP content negotiation
 11. create an index.html file allowing humans to browse the collection as well 
as point robots to the RDF
 12. for extra credit, import all the RDF into a triple store and provide 
access via SPARQL
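Sketched in Perl, the per-record loop of that recipe might look something like the 
following (MARC::Batch, MARC::File::XML, and XML::LibXSLT do the heavy lifting; the 
stylesheet names, the namespace, and the output directories are placeholders, and 
the MODS intermediate step is folded into the RDF stylesheet for brevity):

  use MARC::Batch;
  use MARC::File::XML;
  use XML::LibXML;
  use XML::LibXSLT;

  my $xslt    = XML::LibXSLT->new;
  my $to_html = $xslt->parse_stylesheet_file('marcxml2html.xsl');  # hypothetical stylesheet
  my $to_rdf  = $xslt->parse_stylesheet_file('marcxml2rdf.xsl');   # hypothetical stylesheet
  my %mapping;                                                     # URI => location on disc

  my $batch = MARC::Batch->new('USMARC', 'pamphlets.marc');        # step 1
  while (my $record = $batch->next) {                              # steps 2 and 8
      my $id  = $record->field('001')->data;
      my $uri = "http://example.org/catalog/$id";                  # placeholder namespace
      my $xml = XML::LibXML->load_xml(string => $record->as_xml_record);            # step 3
      spit("html/$id.html", $to_html->output_as_bytes($to_html->transform($xml)));  # steps 4, 6
      spit("rdf/$id.rdf", $to_rdf->output_as_bytes($to_rdf->transform($xml)));      # steps 5, 6
      $mapping{$uri} = "rdf/$id.rdf";                              # step 7
  }
  # steps 9 through 11: %mapping now drives the site maps, content negotiation, and index.html

  sub spit {
      my ($file, $data) = @_;
      open my $fh, '>', $file or die "$file: $!";
      print {$fh} $data;
  }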

I think I can do the same thing with EAD files. Moreover, I think I can do this 
with a small number of (Perl) scripts easily readable by others, enabling them 
to implement the scripts in a programming language of their choice. Once I get 
this far, metadata experts can improve the ontologies, and computer scientists 
can improve the infrastructure. In the meantime the linked data can be 
harvested for the good purposes for which linked data was articulated.

It is in my head. It really is. All I need is the time, focus, and energy to 
implement it. On my mark. Get set. Go.


[1] MARC21slim2RDFDC.xsl - 
http://www.loc.gov/standards/marcxml/xslt/MARC21slim2RDFDC.xsl
[2] modsrdf.xsl - 
http://www.loc.gov/standards/mods/modsrdf/xsl-files/modsrdf.xsl
[3] mods2rdf.xslt - http://infomotions.com/tmp/mods2rdf.xslt
[4] MARC/MODS RDFizer - http://simile.mit.edu/wiki/MARC/MODS_RDFizer
[5] ead2rdf.xsl - http://data.archiveshub.ac.uk/xslt/ead2rdf.xsl

— 
Eric Lease Morgan


Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Mark A. Matienzo
On Thu, Dec 5, 2013 at 11:11 AM, Eric Lease Morgan emor...@nd.edu wrote:


 I’m hoping to articulate and implement a simple and functional method for
 exposing EAD and MARC metadata as linked data.


Isn't the point of this to expose archival description as linked data? What
about description maintained in applications like a collection management
system, say, ArchivesSpace or Archivists' Toolkit?

Mark

--
Mark A. Matienzo m...@matienzo.org
Director of Technology, Digital Public Library of America


Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Eric Lease Morgan
On Dec 5, 2013, at 11:17 AM, Mark A. Matienzo mark.matie...@gmail.com wrote:

 I’m hoping to articulate and implement a simple and functional method for
 exposing EAD and MARC metadata as linked data.
 
 Isn't the point of this to expose archival description as linked data? What
 about description maintained in applications like a collection management
 system, say, ArchivesSpace or Archivists' Toolkit?


Good question! At the very least, these applications (ArchivesSpace, 
Archivists’ Toolkit, etc.) can regularly and systematically export their data 
as EAD, and the EAD can be made available as linked data. It would be ideal if 
the applications were to natively make their metadata available as linked 
data, but exporting their content as EAD is a functional stopgap solution. 
—Eric Morgan


[CODE4LIB] Job: Database Administrator/IT Specialist, BDLSS at University of Oxford

2013-12-05 Thread jobs
Database Administrator/IT Specialist, BDLSS
University of Oxford
Oxford

We are seeking an experienced Database Administrator, to join our established
Bodleian Digital Library Systems and Support team. The team provides support
and development services for the libraries' core service applications
including the integrated library system, resource discovery platforms and
admissions system.

  
You will be responsible for the management of key library system databases,
principally the Bodleian Libraries Admissions card service, and will
additionally be required to provide support to the library catalogue and
resource discovery services.

  
You will have significant experience in managing SQL Server 2008 applications,
be familiar with use of the Unix/Linux command line and have scripting
experience. Some knowledge of web technologies including HTML and CSS is
essential, as are good problem-solving, analytical and communication skills.

  
Applications are to be made online. To apply for this role and for further
details, including a job description and selection criteria, please click on
the link below:

  
https://www.recruit.ox.ac.uk/pls/hrisliverecruit/erq_jobspec_version_4.jobspec?p_id=110819

  
You will be required to upload a supporting statement as part of your online
application. Only applications received before midday on Monday 6th January
2014 can be considered. Interviews are anticipated to take place during w/c 13
January 2014.



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/11057/


Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Mark A. Matienzo
On Thu, Dec 5, 2013 at 11:26 AM, Eric Lease Morgan emor...@nd.edu wrote:


 Good question! At the very least, these applications (ArchivesSpace,
 Archivists’ Toolkit, etc.) can regularly and systematically export their
 data as EAD, and the EAD can be made available as linked data. It would be
 ideal if the applications were to natively make their metadata available
 as linked data, but exporting their content as EAD is a functional stopgap
 solution. —Eric Morgan


Wouldn't it make more sense, especially with a system like ArchivesSpace,
which provides a backend HTTP API and a public UI, to publish linked data
directly instead of adding yet another stopgap?

Mark

--
Mark A. Matienzo m...@matienzo.org
Director of Technology, Digital Public Library of America


Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Eric Lease Morgan
On Dec 5, 2013, at 11:33 AM, Mark A. Matienzo mark.matie...@gmail.com wrote:

 At the very least, these applications (ArchivesSpace,
 Archivists’ Toolkit, etc.) can regularly and systematically export their
 data as EAD, and the EAD can be made available as linked data.
 
 Wouldn't it make more sense, especially with a system like ArchivesSpace,
 which provides a backend HTTP API and a public UI, to publish linked data
 directly instead of adding yet another stopgap?


Publishing via a content management system would make more sense, if:

  1. the archivist used the specific content management system, and
  2. the content management system supported the functionality

“There is more than one way to skin a cat.” There are advantages and 
disadvantages to every software solution.

—
Eric


Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Corey A Harper
With apologies to Eric and to others from the LiAM project, I feel like I
want to jump in here with a little more context.

Eric, or Aaron, or Anne, please feel free to correct any of what I say
below.

I agree with the points made and concerns raised by both Ross and Mark --
most significantly, that a sustainable infrastructure for linked archival
metadata is not going to come from an XSLT stylesheet. However, I also see
tremendous value in what Eric is putting together here.

The prospectus for the LiAM project, which is the context for Eric's
questions, is about developing guiding principles and educational tools for
the archival community to better understand, prepare for, and contribute to
the kind of infrastructure both Ross and Mark are talking about:
http://sites.tufts.edu/liam/deliverables/prospectus-for-linked-archival-metadata-a-guidebook/

While I agree that converting legacy data in EAD and MARC formats to RDF is
not the approach this work will take in the future, I also believe that
these are formats that the archival community is very familiar with, and
XSLT is a tool that many archivists work with regularly. A workflow with which
that community can experiment is a laudable goal.

In short, I think we need approaches that illustrate the potential of
linked data in archives, to highlight some of the shortcomings in our
current metadata management frameworks, to help archivists be in a position
to get their metadata ready for what Mark is describing in the context of
ArchivesSpace (e.g. please use id attributes in c tags!!), and to have a
more complete picture of why doing so is of some value.

Sorry for the long message, and I hope that the context is helpful.

Regards,
-Corey



On Thu, Dec 5, 2013 at 11:33 AM, Mark A. Matienzo
mark.matie...@gmail.comwrote:

 On Thu, Dec 5, 2013 at 11:26 AM, Eric Lease Morgan emor...@nd.edu wrote:

 
  Good question! At the very least, these applications (ArchivesSpace,
  Archivists’ Toolkit, etc.) can regularly and systematically export their
  data as EAD, and the EAD can be made available as linked data. It would
 be
  ideal if the applications were to natively make their metadata available
  as linked data, but exporting their content as EAD is a functional
 stopgap
  solution. —Eric Morgan
 

 Wouldn't it make more sense, especially with a system like ArchivesSpace,
 which provides a backend HTTP API and a public UI, to publish linked data
 directly instead of adding yet another stopgap?

 Mark

 --
 Mark A. Matienzo m...@matienzo.org
 Director of Technology, Digital Public Library of America




-- 
Corey A Harper
Metadata Services Librarian
New York University Libraries
20 Cooper Square, 3rd Floor
New York, NY 10003-7112
212.998.2479
corey.har...@nyu.edu


Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread McAulay, Elizabeth
I've been following this conversation as a non-coder. I'm really interested in 
getting a better understanding of linked data and how to use existing metadata 
for proof of concept linked data outputs. So, I totally think Eric's approaches 
are valuable and would be something I would use. I also understand there are 
many ways to do something better and more in the flow. So, just encouraging 
you all to keep posting thoughts in both directions!

Best,
Lisa 
-
Elizabeth Lisa McAulay
Librarian for Digital Collection Development
UCLA Digital Library Program
http://digital.library.ucla.edu/
email: emcaulay [at] library.ucla.edu

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Eric Lease 
Morgan [emor...@nd.edu]
Sent: Thursday, December 05, 2013 8:57 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] transforming marc to rdf

On Dec 5, 2013, at 11:33 AM, Mark A. Matienzo mark.matie...@gmail.com wrote:

 At the very least, these applications (ArchivesSpace,
 Archivists’ Toolkit, etc.) can regularly and systematically export their
 data as EAD, and the EAD can be made available as linked data.

 Wouldn't it make more sense, especially with a system like ArchivesSpace,
 which provides a backend HTTP API and a public UI, to publish linked data
 directly instead of adding yet another stopgap?


Publishing via a content management system would make more sense, if:

  1. the archivist used the specific content management system, and
  2. the content management system supported the functionality

“There is more than one way to skin a cat.” There are advantages and 
disadvantages to every software solution.

—
Eric


Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Mark A. Matienzo
On Thu, Dec 5, 2013 at 11:57 AM, Eric Lease Morgan emor...@nd.edu wrote:

 On Dec 5, 2013, at 11:33 AM, Mark A. Matienzo mark.matie...@gmail.com
 wrote:

  Wouldn't it make more sense, especially with a system like ArchivesSpace,
  which provides a backend HTTP API and a public UI, to publish linked data
  directly instead of adding yet another stopgap?


 Publishing via a content management system would make more sense, if:

   1. the archivist used the specific content management system, and
   2. the content management system supported the functionality

 “There is more than one way to skin a cat.” There are advantages and
 disadvantages to every software solution.


I recognize that not everyone uses a collection management system and that
some may author description using EAD or something else directly, but I
think we really need to acknowledge the affordances of that kind of software
here.

I can tell you for certain that there are aspects of the ArchivesSpace
data model that are not serializable in any good way - or at all - using
EAD or MARC.

Per Corey's message:

I have no objection in principle to using XSLT to provide examples of ways
to do this transformation (I know lots of people have piles of existing
EAD) as long as the resulting data is acknowledged to be less than ideal.
EAD is also not a data model; it's a document model for a finding aid. EAD3
will improve this somewhat, but it's still not a representation of a
conceptual model of archival entities.

My concern about using something like XSLT *specifically* to transform
archival description stored in MARC is that the existing stylesheets assume
that the MARC description is bibliographic description. Archival
description is not bibliographic description.

Mark


Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Ross Singer
On Thu, Dec 5, 2013 at 11:57 AM, Eric Lease Morgan emor...@nd.edu wrote:


 “There is more than one way to skin a cat.” There are advantages and
 disadvantages to every software solution.


I think what Mark and I are trying to say is that the first step to this
solution is not applying software to existing data, but figuring out the
problem you're actually trying to solve.  Any linked data future cannot be
as simple as a technologist giving some magic tool to archivists and
librarians.

You still haven't really answered my question about what you're hoping to
achieve and who stands to benefit from it.  I don't see how assigning a
bunch of arbitrary identifiers, properties, and values to a description of
a collection of archival materials accomplishes that (especially since
you're talking about doing this in XSLT, so your archival collections can't
even really be related to /each other/, much less to anything else).

Who is going to use this data?  What are they supposed to do
with it?  What will libraries and archives get from it?

I am certainly not above academic exercises (or without my own), but I
absolutely can see *no* beneficial archival linked data created simply by
pointing an XSLT at a bunch of EAD and MARCXML, and I certainly can't
without a clear vision of the model that said XSLT is supposed to generate.
 The key part here is the data model, and taking a 'software
solution'-first approach does nothing to address that.

-Ross.


Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Kevin Ford

* BIBFRAME Tools [6] - sports nice ontologies, but
  the online tools won’t scale for large operations
-- The code running the transformation at [6] is available here:

https://github.com/lcnetdev/marc2bibframe

We've run several million records through it at one time.  As with 
everything, the data needs to be properly prepared and we have a script 
that processes those millions in smaller (but still sizeable) batches.
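For reference, splitting a big file of records into batches like that can be 
sketched in a few lines of Perl (illustrative only, not our actual preparation 
script; the chunk size is arbitrary):

  use MARC::Batch;

  my $batch = MARC::Batch->new('USMARC', 'all-records.mrc');
  my ($count, $chunk, $out) = (0, 0);
  while (my $record = $batch->next) {
      if ($count++ % 250_000 == 0) {    # start a new chunk every 250,000 records
          open $out, '>:raw', sprintf('chunk-%04d.mrc', ++$chunk) or die $!;
      }
      print {$out} $record->as_usmarc;
  }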


Yours,
Kevin


On 12/04/2013 09:59 PM, Eric Lease Morgan wrote:

I have to eat some crow, and I hope somebody here can give me some advice for 
transforming MARC to RDF.

I am in the midst of writing a book describing the benefits of linked data for 
archives. Archival metadata usually comes in two flavors: EAD and MARC. I found 
a nifty XSL stylesheet from the Archives Hub (that’s in the United Kingdom) 
transforming EAD to RDF/XML. [1] With a bit of customization I think it could 
be used quite well for just about anybody with EAD files. I have retained a 
resulting RDF/XML file online. [2]

Converting MARC to RDF has been more problematic. There are various tools 
enabling me to convert my original MARC into MARCXML and/or MODS. After that I 
can reportedly use a few tools to convert to RDF:

   * MARC21slim2RDFDC.xsl [3] - functions, but even for
 my tastes the resulting RDF is too vanilla. [4]

   * modsrdf.xsl [5] - optimal, but when I use my
 transformation engine (Saxon), I do not get XML
 but rather plain text

   * BIBFRAME Tools [6] - sports nice ontologies, but
 the online tools won’t scale for large operations

In short, I have discovered nothing that is “easy-to-use”. Can you provide me 
with any other links allowing me to convert MARC to serialized RDF?

[1] ead2rdf.xsl - http://data.archiveshub.ac.uk/xslt/ead2rdf.xsl
[2] transformed EAD file - http://infomotions.com/tmp/una-ano.rdf
[3] MARC21slim2RDFDC.xsl - 
http://www.loc.gov/standards/marcxml/xslt/MARC21slim2RDFDC.xsl
[4] vanilla RDF - http://infomotions.com/tmp/pamphlets.rdf
[5] modsrdf.xsl - 
http://www.loc.gov/standards/mods/modsrdf/xsl-files/modsrdf.xsl
[6] BIBFRAME Tools - http://bibframe.org/tools/transform/start

—
Eric Lease Morgan



Re: [CODE4LIB] transforming marc to rdf

2013-12-05 Thread Kevin Ford

 Anything that will remodel MARC to (decent) RDF is going to be:

 - Non-trivial to install
 - Non-trivial to use
 - Slow
 - Require massive amounts of memory/disk space

 Choose any two.
-- I'll second this.


 Frankly, I don't see how you can generate RDF that anybody would want to
 use from XSLT: where would your URIs come from?  What, exactly, are you
 modeling?
-- Our experience getting to good, URI-rich RDF has been basically a 
two-step process.  First there is the raw conversion, which certainly 
results in verbose blank-node-rich RDF, but we follow that pass with a 
second one during which blank nodes are replaced with URIs.


This has most certainly been the case with BIBFRAME because X number of 
MARC records may represent varying manifestations of a single work.  We 
don't want X number of instances (manifestations basically) referencing 
X number of works in the end, but X number of instances referencing 1 
work (all other things being equal).  We consolidate - for lack of a 
better word - X number of works created in the first pass into 1 work 
(identified by an HTTP URI) and then we make sure X number of instances 
point to that one work, removing all the duplicate blank-node-identified 
resources created during the first pass.
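In Turtle the effect is roughly the following (prefixes omitted; the class and 
property names are only approximately BIBFRAME, and the URI is a placeholder). 
Before consolidation, each MARC record yields its own blank-node work:

  _:work1 a bf:Work .
  _:inst1 a bf:Instance ; bf:instanceOf _:work1 .
  _:work2 a bf:Work .
  _:inst2 a bf:Instance ; bf:instanceOf _:work2 .

After the second pass, the duplicate works collapse into one URI-identified work:

  <http://example.org/works/1> a bf:Work .
  _:inst1 a bf:Instance ; bf:instanceOf <http://example.org/works/1> .
  _:inst2 a bf:Instance ; bf:instanceOf <http://example.org/works/1> .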


Granted this consolidation scenario is not scalable without a fairly 
robust backend solution, but the process at bibframe.org (the code on 
github) nevertheless does the type of consolidation described above in 
memory with small MARC collections.


Yours,
Kevin





On 12/05/2013 08:55 AM, Ross Singer wrote:

Eric, I'm having a hard time figuring out exactly what you're hoping to get.

Going from MARC to RDF was my great white whale for years while Talis' main
business interests involved both of those (although not archival
collections).  Anything that will remodel MARC to (decent) RDF is going to be:

- Non-trivial to install
- Non-trivial to use
- Slow
- Require massive amounts of memory/disk space

Choose any two.

--



Frankly, I don't see how you can generate RDF that anybody would want to
use from XSLT: where would your URIs come from?  What, exactly, are you
modeling?

I guess, to me, it would be a lot more helpful for you to take an archival
MARC record, and, by hand, build an RDF graph from it, then figure out your
mappings.  I just don't see any way to make it easy-to-use, at least, not
until you have an agreed upon model to map to.

-Ross.


On Thu, Dec 5, 2013 at 3:07 AM, Christian Pietsch 
chr.pietsch+web4...@googlemail.com wrote:


Hi Eric,

you seem to have missed the Catmandu tutorial at SWIB13. Luckily there
is a basic tutorial and a demo online: http://librecat.org/

The demo happens to be about transforming MARC to RDF using the
Catmandu Perl framework. It gives you full flexibility by separating
the importer from the exporter and providing a domain specific
language for “fixing” the data in between. Catmandu also has easy
to use wrappers for popular search engines and databases (both SQL and
NoSQL), making it a complete ETL (extract, transform, load) toolkit.

Disclosure: I am a Catmandu contributor. It's free and open source
software.

Cheers,
Christian


On Wed, Dec 04, 2013 at 09:59:46PM -0500, Eric Lease Morgan wrote:

Converting MARC to RDF has been more problematic. There are various
tools enabling me to convert my original MARC into MARCXML and/or
MODS. After that I can reportedly use a few tools to convert to RDF:

   * MARC21slim2RDFDC.xsl [3] - functions, but even for
 my tastes the resulting RDF is too vanilla. [4]

   * modsrdf.xsl [5] - optimal, but when I use my
 transformation engine (Saxon), I do not get XML
 but rather plain text

   * BIBFRAME Tools [6] - sports nice ontologies, but
 the online tools won’t scale for large operations


--
   Christian Pietsch · http://www.ub.uni-bielefeld.de/~cpietsch/
   LibTec · Library Technology and Knowledge Management
   Bielefeld University Library, Bielefeld, Germany



[CODE4LIB] coder needed: JavaScript / Google Maps API V.3 Programming project

2013-12-05 Thread Derek Merleaux
know any folks w/ Google Maps API skills looking for an odd job? pls. fwd

thanks!

=Derek


*JavaScript / Google Maps API V.3 Programming Job : Wolfsonian – FIU
(www.wolfsonian.org)*



We are putting together a site to provide public access to the
high-resolution digital image of a painting called the Menneske Pyramide
[Human Pyramid], by Harald Rudyard Engman, created in Copenhagen, Denmark
in 1941.
http://www.wolfsonian.org/explore/collections/menneske-pyramide-human-pyramid

This painting (which is part of our permanent collection) is full of
socio-political messages relating to Denmark's unfortunate involvement in
WWII. Many of the figures depicted in the painting are caricatures or
likenesses of figures from Danish history and mythology as well as current
political and cultural figures from the early part of the 20th century. In
the past some curators and researchers have identified some of these
figures and we are hoping to utilize the knowledge of the crowd to fill in
some more of the blanks.


To this end we have created a prototype site that uses the custom map
feature of the Google Maps API v.3 which allows most web browsers quick and
smooth zooming and panning of this near-gigapixel image. Rudimentary coding
has placed some permanent map-pin markers with pop-up info boxes containing
text annotations for the figures we have already identified.



What we need from you:

We want the public version to allow people to contribute annotations by
dragging a new marker-pin onto the image and then entering text in that
marker’s info box – this annotation would be submitted to a moderation
queue and if approved would show up on the public version for all to see.
How this is managed is up to you as long as it’s easily sustainable by our
staff – Google Docs or Fusion Tables are fine – we are open to suggestions.



Please contact Derek Merleaux [de...@thewolf.fiu.edu] – we are seeking
direct contacts from coders only, please; no solicitations on behalf of
third parties. Also, you must be willing and able to become an approved
vendor to Florida International University [details:
http://finance.fiu.edu/purchasing/2vendor_forms.html]

*-*


[CODE4LIB] Job: Front-End Web Designer / UI Specialist at East Carolina University

2013-12-05 Thread jobs
Front-End Web Designer / UI Specialist
East Carolina University
Greenville, NC

The East Carolina University Libraries are looking for a Front-End Web
Designer / UI Specialist. The responsibilities and qualifications we're looking
for are the following:

  
This individual works collaboratively with library departments to assess web
development needs and user requirements to create engaging design
concepts/visual assets. These concepts are then translated into working
prototypes for a variety of projects involving the website, custom
applications, and discovery interfaces. This individual then works with fellow
team members to see prototypes through the development lifecycle. In addition
to new development, this person works to improve usability and the overall
user experience of the libraries' existing web presence. This person works with
marketing and other team members to harmonize existing projects under a
unified brand, enhance the usability of existing projects, and improve the
user experience as it relates to discovery of library services and resources.
The individual in this position performs usability analyses to ensure optimal
accessibility to library tools and resources.

  
Preferred Experience: Bachelor's degree in a user experience discipline such
as human-computer interaction design, interaction design, or related field;
2-3 years of experience in interface and visual design (as demonstrated by a
portfolio) OR 2-3 years of web design and development experience, with a solid
understanding of information architecture and UI design.
Experience with interaction design, visual design, image formats and
properties, web design, web fonts, mobile design; demonstrated experience with
the relevant interaction and visual design tools (Adobe Creative Suite or
equivalent). Demonstrated experience with HTML, CSS, and
JavaScript, and responsive design; an understanding of cross browser issues,
user interface design best practices, web and accessibility standards.

  
Interested and talented candidates are encouraged to apply!

  
Here is the permalink: http://ecu.peopleadmin.com/applicants/Central?quickFind=73163



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/11058/


Re: [CODE4LIB] WASSAIL / Assessment Tools

2013-12-05 Thread Hagedon, Mike
Here's what one of our instructional librarians said in response to this:

At the University of Arizona Libraries, we piloted WASSAIL in 2012.  There 
were a number of usability issues.  The user interface was not intuitive; you 
couldn't preview created question items without creating a test - I believe 
this has now been fixed; couldn't easily import questions into D2L 
(Desire2Learn); upgrades to the database were made infrequently, since there 
was only one technology staff member assigned to the product.  My advice would be 
to contact the tech staff for questions about the tech aspects of WASSAIL.

HTH,
Mike

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Michael 
Schofield
Sent: Wednesday, November 27, 2013 11:04 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] WASSAIL / Assessment Tools

Hey see-for-ehl,

It's still a year or two out, but re-accreditation awaits us at the end of the 
harrowing tunnel of library-work. One of my coworkers asked me to explore tools 
for assessment, linking me, as an example, to WASSAIL // Augustana Information 
Literacy from the University of Alberta 
(http://www.library.ualberta.ca/augustana/infolit/wassail/). I can't say that 
I'm particularly qualified to judge assessment tools and I was hoping you might 
have insight about WASSAIL or anything else.

My only real concern is that I don't want to adopt or force a tool that will 
only be a temporary stop-gap or will somehow be silo'd (siloed?) from the rest 
of our applications and content so I'd hope the tool would be versatile and 
easy to adapt so that we could truly integrate it. I can judge this part.

I don't know jack about assessment / assessment tools!

Michael Schofield(@nova.edu) | Web Services Librarian | (954) 262-4536 Alvin 
Sherman Library, Research, and Information Technology Center

// oh, and I write about the #libweb at www.ns4lib.com


[CODE4LIB] Online Course: Responsive Web Design for Libraries

2013-12-05 Thread Matthew Reidsma
Please feel free to share on other appropriate listservs, blogs, and with
colleagues.



Responsive Web Design for Libraries
https://infopeople.org/civicrm/event/info?reset=1&id=281

An Infopeople online course, January 28 to February 24, 2014

By next year there will be more active mobile devices than people on the
planet. How can you ensure that your library's online services work as well
on smartphones and tablets as they do on desktop computers? What about
devices that haven’t been dreamed of yet? Instead of reacting to each new
device, you can build websites that adapt to any device. Join Matt Reidsma,
author of a new book on responsive design for libraries, to learn:


   - The basics of responsive web design (RWD)
   - How to compare RWD against other solutions to the “mobile problem”
   - How to implement best practices for website design in an increasingly
   mobile world even if you don't use RWD

Instructor: Matthew Reidsma

Fee: $75 for those in the California library community, $150 for all others.

For a complete course description and to register go to
https://infopeople.org/civicrm/event/info?reset=1&id=281.

--
---
Matthew Reidsma
Web Services Librarian, Grand Valley State University
matthewreidsma.com :: @mreidsma


[CODE4LIB] Fwd: PASIG Webinar - Digital Forensics and BitCurator

2013-12-05 Thread Jodi Schneider
Possibly of interest: Digital Forensics webinar about Bitcurator (
http://wiki.bitcurator.net ) software.

-- Forwarded message --
From: ASIST Continuing Education educat...@asis.org
Date: Fri, Dec 6, 2013 at 1:51 AM
Subject: PASIG Webinar - Digital Forensics and BitCurator
To: jschnei...@pobox.com




   *PASIG Webinar - Digital Forensics and BitCurator*

*Join us for a Webinar on December 12*

*Free for ASIST Members, $20 for non-members*

https://www3.gotomeeting.com/register/434136862

*Space is limited.*
Reserve your Webinar seat now at:
http://www.asis.org/Conferences/webinars/Webinar-PASIG-12-12-2013-register.html

The BitCurator Project, a collaborative effort led by the School of
Information and Library Science at the University of North Carolina at
Chapel Hill and Maryland Institute for Technology in the Humanities at the
University of Maryland, builds on previous work by addressing two
fundamental needs and opportunities for collecting institutions: (1)
integrating digital forensics tools and methods into the workflows and
collection management environments of libraries, archives and museums and
(2) supporting properly mediated public access to forensically acquired
data.

The project is developing and disseminating a suite of open source tools.
These tools are currently being developed and tested in a Linux
environment; the software on which they depend can readily be compiled for
Windows environments (and in most cases are currently distributed as both
source code and Windows binaries). We intend the majority of the
development for BitCurator to support cross-platform use of the software.
We are freely disseminating the software under an open source (GPL, Version
3) license. BitCurator provides users with two primary paths to integrate
digital forensics tools and techniques into archival and library workflows.

This webinar will introduce the BitCurator environment and briefly
highlight support for mounting media as read-only, creating disk images,
using Nautilus scripts to perform batch activities, generation of Digital
Forensics XML (DFXML), generation of customized reports, and identification
of sensitive data within data.

Participants who are interested in trying out the software in advance can
download and install the BitCurator environment by following the
instructions at: http://wiki.bitcurator.net



*Title:*

*PASIG Webinar - Digital Forensics and BitCurator*

*Date:*

Thursday, December 12, 2013

*Time:*

11:30 AM - 12:30 PM EST

   After registering you will receive a confirmation e-mail containing
information about joining the Webinar.



*System Requirements*
PC-based attendees
Required: Windows® 8, 7, Vista, XP or 2003 Server

Mac®-based attendees
Required: Mac OS® X 10.6 or newer

Mobile attendees
Required: iPhone®, iPad®, Android™ phone or Android tablet

If you do not wish to receive any further electronic marketing
communications from ASIST you can opt-out completely, please note you will
no longer receive Association updates or any conference information you may
have subscribed to. To unsubscribe please send an e-mail to
webin...@asis.org


Re: [CODE4LIB] REMINDER: Voting for Code4Lib 2014 Prepared Talks ends December 6th

2013-12-05 Thread Wick, Ryan
I've activated all of the new code4lib.org accounts I could find over the last 
couple weeks. If you registered at code4lib.org but the account has not been 
activated yet, let me know what your username is. Or if you have any other 
account login issues.

Ryan Wick
ryanw...@gmail.com

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kevin 
Beswick
Sent: Tuesday, December 03, 2013 7:14 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] REMINDER: Voting for Code4Lib 2014 Prepared Talks ends 
December 6th

The program committee would like to remind everyone to submit their votes for 
prepared talk proposals for Code4Lib 2014 in Raleigh, NC. Voting closes this 
Friday, December 6th by 11:59:59pm PDT.

To vote for prepared talks, visit: http://vote.code4lib.org/election/28

For more details about voting, see the original message below.

Thanks!
Code4Lib Program Committee

-- Forwarded message --
From: Trevor Thornton trevorthorn...@nypl.org
Date: Mon, Nov 18, 2013 at 10:12 AM
Subject: [CODE4LIB] Voting for Code4Lib 2014 Prepared Talks begins today!
To: CODE4LIB@listserv.nd.edu


The Code4Lib 2014 Program Committee is happy to announce that voting is now 
open for prepared talks.

To vote, visit http://vote.code4lib.org/election/28, review the proposals, and 
assign points to those presentations you would like to see on the program this 
year.

You will need to log in with your code4lib.org username and password in order 
to vote. If you have any issues with your account, please contact Ryan Wick at 
ryanw...@gmail.com.

*Voting will end on Friday, December 6, 2013 at 11:59:59 PM PDT.*

The 10 proposals with the most votes will be guaranteed a slot at the 
conference. Additional presentations will be selected by the Program Committee 
in an effort to ensure diversity in program content. Community votes will still 
weigh heavily in these decisions.

For more information about Code4Lib 2014, visit 
http://code4lib.org/conference/2014/.


--
Trevor Thornton
Senior Applications Developer, NYPL Labs The New York Public Library
phone: 212-621-0287
email: trevorthorn...@nypl.org