[CODE4LIB] Code4Lib Journal Issue 36 published

Peter Murray Thu, 20 Apr 2017 11:38:23 -0700

The new issue of the Code4Lib Journal is now available:

http://journal.code4lib.org/issues/issues/issue36


The table of contents is below.  As you are reading, also know that we are 
looking for editors to join the Code4Lib Journal Editorial Committee.  What 
does it mean to join the editorial committee?  Read more about our process and 
structure (http://journal.code4lib.org/process-and-structure) and/or ask one of 
the current members of the editorial committee 
(http://journal.code4lib.org/editorial-committee).  Interested?  Send a letter 
to [email protected] and address these two questions:

  1) What is your vision for the Code4Lib Journal? Why are you interested in it?

  2) How can you contribute to the Code4Lib Journal, i.e. what do you have to 
offer?

In the meantime, enjoy issue 36!


Editorial: Reflecting on the success and risks to the Code4Lib Journal
Peter E. Murray

Linked Data is People: Building a Knowledge Graph to Reshape the Library Staff 
Directory
Jason A. Clark and Scott W. H. Young
One of our greatest library resources is people. Most libraries have staff 
directory information published on the web, yet most of this data is trapped in 
local silos, PDFs, or unstructured HTML markup. With this in mind, the library 
informatics team at Montana State University (MSU) Library set a goal of 
remaking our people pages by connecting the local staff database to the Linked 
Open Data (LOD) cloud. In pursuing linked data integration for library staff 
profiles, we have realized two primary use cases: improving the search engine 
optimization (SEO) for people pages and creating network graph visualizations. 
In this article, we will focus on the code to build this library graph model as 
well as the linked data workflows and ontology expressions developed to support 
it. Existing linked data work has largely centered around machine-actionable 
data and improvements for bots or intelligent software agents. Our work 
demonstrates that connecting your staff directory to the LOD cloud can reveal 
relationships among people in dynamic ways, thereby raising staff visibility 
and bringing an increased level of understanding and collaboration potential 
for one of our primary assets: the people that make the library happen.

Recommendations for the application of Schema.org to aggregated Cultural 
Heritage metadata to increase relevance and visibility to search engines: the 
case of Europeana
Richard Wallis, Antoine Isaac, Valentine Charles, and Hugo Manguinhas
Europeana provides access to more than 54 million cultural heritage objects 
through its portal Europeana Collections. It is crucial for Europeana to be 
recognized by search engines as a trusted authoritative repository of cultural 
heritage objects. Indeed, even though its portal is the main entry point, most 
Europeana users come to it via search engines.
Europeana Collections is fuelled by metadata describing cultural objects, 
represented in the Europeana Data Model (EDM). This paper presents the research 
and consequent recommendations for publishing Europeana metadata using the 
Schema.org vocabulary and best practices. Schema.org metadata embedded in HTML 
is meant to be consumed by search engines to power rich services (such as Google 
Knowledge Graph). Schema.org is an open and widely adopted initiative (used by 
over 12 million domains) backed by Google, Bing, Yahoo!, and Yandex, for sharing 
metadata across the web. It underpins the emergence of new web techniques, such 
as so-called Semantic SEO.
Our research addressed the representation of the embedded metadata as part of 
the Europeana HTML pages and sitemaps so that the re-use of this data can be 
optimized.
The practical objective of our work is to produce a Schema.org representation 
of Europeana resources described in EDM that is as rich as possible and 
tailored to Europeana’s realities and user needs as well as to the search 
engines and their users.

Autoload: a pipeline for expanding the holdings of an Institutional Repository 
enabled by ResourceSync
James Powell, Martin Klein and Herbert Van de Sompel
Providing local access to locally produced content is a primary goal of the 
Institutional Repository (IR). Guidelines, requirements, and workflows are 
among the ways in which institutions attempt to ensure this content is 
deposited and preserved, but some content is always missed. At Los Alamos 
National Laboratory, the library implemented a service called LANL Research 
Online (LARO), to provide public access to a collection of publicly shareable 
LANL researcher publications authored between 2006 and 2016. LARO exposed the 
fact that we have full text for only about 10% of eligible publications for 
this time period, despite a review and release requirement that ought to have 
resulted in a much higher deposition rate. This discovery motivated a new 
effort to discover and add more full text content to LARO. Autoload attempts to 
locate and harvest items that were not deposited locally, but for which 
archivable copies exist. Here we describe the Autoload pipeline prototype and 
how it aggregates and utilizes Web services including Crossref, SHERPA/RoMEO, 
and oaDOI as it attempts to retrieve archivable copies of resources. Autoload 
employs a bootstrapping mechanism based on the ResourceSync standard, a NISO 
standard for data replication and synchronization. We implemented support for 
ResourceSync atop the LARO Solr index, which exposes metadata contained in the 
local IR. This allowed us to utilize ResourceSync without modifying our IR. We 
close with a brief discussion of other uses we envision for our 
ResourceSync-Solr implementation, and describe how a new effort called 
Signposting can replace cumbersome screen scraping with a robust autodiscovery 
path to content which leverages the Web protocol.

Outside The Box: Building a Digital Asset Management Ecosystem for Preservation 
and Access
Andrew Weidner, Sean Watkins, Bethany Scott, Drew Krewer, Anne Washington, 
Matthew Richardson
The University of Houston (UH) Libraries made an institutional commitment in 
late 2015 to migrate the data for its digitized cultural heritage collections 
to open source systems for preservation and access: Hydra-in-a-Box, 
Archivematica, and ArchivesSpace. This article describes the work that the UH 
Libraries implementation team has completed to date, including open source 
tools for streamlining digital curation workflows, minting and resolving 
identifiers, and managing SKOS vocabularies. These systems, workflows, and 
tools, collectively known as the Bayou City Digital Asset Management System 
(BCDAMS), represent a novel effort to solve common issues in the digital 
curation lifecycle and may serve as a model for other institutions seeking to 
implement flexible and comprehensive systems for digital preservation and 
access.

Medici 2: A Scalable Content Management System for Cultural Heritage Datasets
Constantinos Sophocleous, Luigi Marini, Ropertos Georgiou, Mohammed Elfarargy, 
Kenton McHenry
Digitizing large collections of Cultural Heritage (CH) resources and providing 
tools for their management, analysis and visualization is critical to CH 
research. A key element in achieving the above goal is to provide user-friendly 
software offering an abstract interface for interaction with a variety of 
digital content types. To address these needs, the Medici content management 
system is being developed in a collaborative effort between the National Center 
for Supercomputing Applications (NCSA) at the University of Illinois at 
Urbana-Champaign, Bibliotheca Alexandrina (BA) in Egypt, and the Cyprus 
Institute (CyI). The project is pursued in the framework of European Project 
“Linking Scientific Computing in Europe and Eastern Mediterranean 2” 
(LinkSCEEM2) and supported by work funded through the U.S. National Science 
Foundation (NSF), the U.S. National Archives and Records Administration (NARA), 
the U.S. National Institutes of Health (NIH), the U.S. National Endowment for 
the Humanities (NEH), the U.S. Office of Naval Research (ONR), the U.S. 
Environmental Protection Agency (EPA) as well as other private sector efforts.
Medici is a Web 2.0 environment integrating analysis tools for the 
auto-curation of un-curated digital data, allowing automatic processing of 
input (CH) datasets, and visualization of both data and collections. It offers 
a simple user interface for dataset preprocessing, previewing, automatic 
metadata extraction, user input of metadata and provenance support, storage, 
archiving and management, representation and reproduction. Building on previous 
experience (Medici 1), NCSA and CyI are working towards the improvement of the 
technical, performance and functionality aspects of the system. The current 
version of Medici (Medici 2) is the result of these efforts. It is a scalable, 
flexible, robust distributed framework with wide data format support (including 
3D models and Reflectance Transformation Imaging-RTI) and metadata 
functionality. We provide an overview of Medici 2’s current features supported 
by representative use cases as well as a discussion of future development 
directions.

An Interactive Map for Showcasing Repository Impacts
Hui Zhang and Camden Lopez
Digital repository managers rely on usage metrics such as the number of 
downloads to demonstrate research visibility and impacts of the repositories. 
Increasingly, they find that current tools such as spreadsheets and charts are 
ineffective for revealing important elements of usage, including reader 
locations, and for attracting the targeted audiences. This article describes 
the design and development of a readership map that provides an interactive, 
near-real-time visualization of actual visits to an institutional repository 
using data from Google Analytics. The readership map exhibits the global 
impacts of a repository by displaying the city of every view or download 
together with the title of the scholarship being read and a hyperlink to its 
page in the repository. We will discuss project motivation and development 
issues such as authentication with Google API, metadata integration, 
performance tuning, and data privacy.

Peter
