Re: [CODE4LIB] eebo [resolved and coolness!!]

2015-06-05 Thread Eric Lease Morgan
On Jun 5, 2015, at 8:10 AM, Eric Lease Morgan wrote:

> Does anybody here have experience reading the SGML/XML files representing the 
> content of EEBO? 

I ultimately found the EEBO files in the form of TEI, and then I was able to 
transform one of them into VERY functional HTML5. Coolness! Here’s the recipe:

 1. download P5 from Box [1]
 2. download stylesheets from GitHub [2]
 3. transform using Saxon [3]
 4. save output to HTTP server 
 5. open in browser [4]
 6. read results AND get scanned image

Nice clean data + fully functional stylesheets = really cool output

[1] P5 - http://bit.ly/1QcvxLP
[2] stylesheets - https://github.com/TEIC/Stylesheets
[3] transform - java -cp saxon9he.jar net.sf.saxon.Transform -t 
-s:/var/www/html/sandbox/eebo-tcp/xml/A0/A06567.xml 
-xsl:/var/www/html/sandbox/eebo-tcp/style/html5/html5.xsl > 
/var/www/html/tmp/eebo.html
[4] output - http://dh.crc.nd.edu/tmp/eebo.html
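For anyone who wants to repeat step 3 across many files, here is a minimal Python sketch that wraps the Saxon call from [3]. The function names are my own invention for illustration; only the java command line itself comes from the recipe above, and the sketch assumes saxon9he.jar sits in the working directory.

```python
import subprocess
from pathlib import Path


def build_saxon_command(source, stylesheet, jar="saxon9he.jar"):
    """Return the argument list for the Saxon transform shown in [3].

    `jar` defaults to the file name used in the recipe; adjust the path
    to wherever Saxon is installed locally (an assumption of this sketch).
    """
    return [
        "java", "-cp", str(jar), "net.sf.saxon.Transform",
        "-t",                   # print timing and version information
        f"-s:{source}",         # TEI source document
        f"-xsl:{stylesheet}",   # TEI stylesheet, e.g. html5/html5.xsl
    ]


def transform(source, stylesheet, output, jar="saxon9he.jar"):
    """Run Saxon and save what it writes to standard output in `output`."""
    command = build_saxon_command(source, stylesheet, jar)
    with open(output, "wb") as handle:
        # check=True raises CalledProcessError if Saxon exits with an error
        subprocess.run(command, stdout=handle, check=True)
    return Path(output)
```

Calling transform() with the three paths from [3] should reproduce the single-file run; looping over a directory of XML files is then a matter of calling it once per file.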

—
ELM


[CODE4LIB] Job: Digital Resources Coordinator at Frontier Nursing University

2015-06-05 Thread jobs
Digital Resources Coordinator
Frontier Nursing University
Lexington, KY

We are seeking a Digital Resources Coordinator to
administer and maintain the digital systems of the Frontier Nursing University
Library. These systems include the institutional repository, library website,
and other web services. The Digital Resources Coordinator will also perform
other technical and public services tasks related to library operations.

  
**Job Duties**

_Institutional Repository_

- Develop and maintain FNU's institutional repository software.
- Establish and maintain consistent metadata for repository items.
- Upload content into repository.
- Collect materials for inclusion in repository.
- Digitize materials as needed.
- Promote and provide access to the repository to all user groups.
- Train and support other contributors to the repository.
- Monitor repository traffic and growth.

  
_Library Web Services_

- Administer the FNU LibGuides account.
- Coordinate with IT to update and maintain the library's website.
- Maintain the online catalogue.

  
_Other_

- Catalogue items for all collections.
- Assist in optimizing technologies for information access and discovery.
- Keep current with trends in digital libraries and institutional repositories.
- Assist with maintenance of electronic resources.
- Assist with answering online and telephone reference questions.
- Perform other related duties as assigned.

  
**Qualifications / Experience**

_Required_

- Master's degree in Library and Information Science or related discipline
- Ability to work both independently and as part of a team
- Ability to deal with students, faculty and staff in a courteous manner and
  communicate with a variety of professionals
- Knowledge of library operations, services, and activities

  
_Preferred_

- Knowledge of HTML, CSS, and other web-authoring tools
- Experience with LibGuides
- Experience using CONTENTdm

  
This position will be located in FNU's administrative office in Lexington with
occasional travel to the Hyden campus required.

Applications are being accepted until June 26, 2015.

Send a resume and cover letter to Billie Anne Gebb, Director of Library
Services, at billieanne.g...@frontier.edu



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/21408/
To post a new job please visit http://jobs.code4lib.org/


Re: [CODE4LIB] eebo

2015-06-05 Thread Stuart A. Yeates
The recently released EEBO texts are available as TEI; I suggest you ask on
the TEI list.

If you want a really vanilla, HTML-like conversion, TEI Boilerplate is probably
a good place to start.

Cheers
Stuart

On Saturday, June 6, 2015, Eric Lease Morgan wrote:

> On Jun 5, 2015, at 8:20 AM, Ethan Gruber wrote:
>
> >> Does anybody here have experience reading the SGML/XML files
> >> representing the content of EEBO?
> >
> > Are these in TEI? Back when I worked for the University of Virginia
> > Library, I did a lot of clean up work and migration of Chadwyck-Healey
> > stuff into TEI-P4 compliant XML (thousands of files), but unfortunately
> > all of the Perl scripts to migrate old garbage SGML into XML are
> > probably gone.
> >
> > How many of these things are really worth keeping, i.e., were not
> > digitized by any other organization that has freely published them
> > online?
>
>
> The data I have comes in two flavors: 1) some flavor of SGML, and 2) some
> flavor of XML which is TEI-like, but not TEI. All of the files are worth
> keeping because I get the basic bibliographic information (id, author,
> title, date, keywords/subjects), as well as transcribed text. (No images.)
> Given such data, I think I can provide interesting, cool, and “kewl”
> services. Given the id number, I may then be able to link to the scanned
> image. Wish me luck. —ELM
>


-- 
--
...let us be heard from red core to black sky


Re: [CODE4LIB] Bibframe and FRBRization

2015-06-05 Thread Harper, Cynthia
Thanks!

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Richard 
Wallis
Sent: Thursday, June 04, 2015 9:49 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Bibframe and FRBRization

Bibframe is a vocabulary which will enable the description of Works and their 
Instances - for the purpose of this conversation a Bibframe Instance 
approximates to a FRBR Manifestation.

The kind of service you describe would be one built upon such data.  It could 
be a specific ISBN to Work id lookup tool or it could be a general query tool 
such as a SPARQL server.  So the capability is there once the data [encoded 
using the Bibframe vocabulary] is available in sufficient quantity to make such 
a service viable.

If you are looking for unique Work identifiers (URIs) for related 
manifestations - there are approximately 200 million of them available from 
WorldCat.org.  Currently the best way to get one is by using the OCLC Number 
associated with your manifestation.

As Peter points out you can get more information here:
https://www.oclc.org/developer/develop/linked-data/worldcat-entities/worldcat-work-entity.en.html

If you want to capture the exampleOfWork through code, the Linked Data 
description of a manifestation is available in several serialisation forms,
not just HTML. So, for example, http://www.worldcat.org/oclc/889647468 gets
you HTML, http://www.worldcat.org/oclc/889647468.ttl gives you Turtle, 
http://www.worldcat.org/oclc/889647468.jsonld gives you JSON-LD, 
http://www.worldcat.org/oclc/889647468.rdf gives you RDF/XML, and 
http://www.worldcat.org/oclc/889647468.nt gives you N-Triples, any of which 
you can parse to extract the exampleOfWork value.
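As a rough illustration of the parsing step, here is a minimal Python sketch that pulls exampleOfWork values out of the N-Triples form using only the standard library. It assumes the property is the schema.org one (http://schema.org/exampleOfWork); the function names and the sample data, including the work id, are made up for illustration, and the fetch helper needs network access and a URL pattern that still responds as described above.

```python
import re
import urllib.request

# Assumption: the property is schema.org's exampleOfWork.
EXAMPLE_OF_WORK = "http://schema.org/exampleOfWork"

# Matches one N-Triples statement whose subject, predicate, and object
# are all IRIs; statements with literal objects are simply skipped.
TRIPLE = re.compile(r"^\s*<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def work_uris(ntriples, predicate=EXAMPLE_OF_WORK):
    """Return the distinct object IRIs of every `predicate` statement."""
    found = []
    for line in ntriples.splitlines():
        match = TRIPLE.match(line)
        if match and match.group(2) == predicate and match.group(3) not in found:
            found.append(match.group(3))
    return found


def fetch_ntriples(oclc_number):
    """Fetch the .nt serialisation for an OCLC number (network required)."""
    url = f"http://www.worldcat.org/oclc/{oclc_number}.nt"
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


# Hypothetical sample data; the work id below is invented for illustration.
SAMPLE = """\
<http://www.worldcat.org/oclc/889647468> <http://schema.org/name> "An example title" .
<http://www.worldcat.org/oclc/889647468> <http://schema.org/exampleOfWork> <http://worldcat.org/entity/work/id/0000000000> .
"""

print(work_uris(SAMPLE))
```

For anything beyond a quick lookup, a real RDF parser is a safer choice than a regular expression, since N-Triples allows escapes and blank nodes that this sketch ignores.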

~Richard.

On 4 June 2015 at 14:35, Harper, Cynthia wrote:

> Thanks - I didn't know about it.
> Cindy
>
> -----Original Message-----
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf 
> Of Boheemen, Peter van
> Sent: Thursday, June 04, 2015 9:33 AM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] Bibframe and FRBRization
>
> Maybe not 'the Bibframe way', but I guess the only existing service 
> for that now would be the WorldCat Work description.
>
> See:
>
> https://www.oclc.org/developer/develop/linked-data/worldcat-entities/worldcat-work-entity.en.html
>
> Peter
>
>
> 
> From: Code for Libraries on behalf of Harper, Cynthia
> Sent: Thursday, June 4, 2015 15:12
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: [CODE4LIB] Bibframe and FRBRization
>
> I am fairly uninformed, but my understanding is that Bibframe is 
> designed to allow distinguishing between work and manifestation as in 
> FRBR.  Will there be some resource to which we can send an ISBN for the 
> manifestation, and get back a permanent unique identifier for the work? 
> (I hope I've got my FRBR concepts straight.)  And if not now, when?  I'm 
> in the thought-experiment phase of planning a dataset that would be based 
> on works, and looking for a good identifier for it.
>
> Cindy Harper
> E-services and periodicals librarian
> Virginia Theological Seminary
> Bishop Payne Library
> 3737 Seminary Road
> Alexandria VA 22304
> char...@vts.edu
> 703-461-1794
>



--
Richard Wallis
Founder, Data Liberate
http://dataliberate.com
Tel: +44 (0)7767 886 005

Linkedin: http://www.linkedin.com/in/richardwallis
Skype: richard.wallis1
Twitter: @rjw


[CODE4LIB] LibGuides best practices

2015-06-05 Thread Jesse Martinez
Hi all,

At the recent NECode4Lib meetup there was some open discussion and interest
in sharing guide standards and/or best practices for LibGuides. I'd like to
share the LibGuides best practices guide we've recently put together at
Boston College.

http://libguides.bc.edu/guidestandards/

This guide is not meant to be exhaustive but mainly to address the top
issues we've seen across our library's content. There's a strong focus on
accessibility as we're going through a website redesign -- notice the Beta
stamp in the upper left-hand corner of the guide. (And accessibility issues
are a strong motivator to get folks to update their content!)

Feedback is welcome. I'd also be interested in others' best practices.

Thanks,

Jesse


Jesse Martinez
Web Services Librarian
O'Neill Library, Boston College
jesse.marti...@bc.edu
617-552-2509


[CODE4LIB] Balisage Symposium on Cultural Heritage Markup Program

2015-06-05 Thread Hugh Cayless
I am very pleased to announce that the program for the Symposium on Cultural 
Heritage Markup is now available: 
http://balisage.net/CulturalHeritage/symposiumProgram.html 
(the symposium will be followed by Balisage: The Markup Conference 2015, 
http://balisage.net/).

All of the talks in some way address and grapple with the complexity of 
Cultural Heritage materials and the difficulties involved in applying standard 
markup solutions to them. A further call for short presentations meant to kick 
off long discussions will be forthcoming, so please be on the lookout!

Topics include: 
- dealing with document fragments and fragmentary annotations
- aligning text to images at the character level
- interoperable cross-references using TEI
- converting metadata from legacy formats to web formats
- representing and storing metadata for long-term use

Symposium Logistics: 
- When: 10 August 2015
- Where: Bethesda North Marriott Hotel & Conference Center
 (Bethesda, Maryland, a suburb of Washington, DC)
- More Info: http://balisage.net/CulturalHeritage/index.html
- Program: http://balisage.net/CulturalHeritage/symposiumProgram.html
- Registration: http://balisage.net/registration.html
- Questions: i...@balisage.net

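======================================================================
Balisage: The Markup Conference 2015          mailto:i...@balisage.net
August 11-14, 2015                             http://www.balisage.net
Preconference Symposium: August 10, 2015               +1 301 315 9631
======================================================================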


/**
 *  Hugh A. Cayless, Ph.D
 *  Chair, TEI Technical Council 
 *  Duke Collaboratory for Classics Computing (DC3)
 *  hugh.cayl...@duke.edu
 *  http://blogs.library.duke.edu/dcthree/
**/


[CODE4LIB] Job: Associate Director of Technology at Salt Lake County Library Services

2015-06-05 Thread jobs
Associate Director of Technology
Salt Lake County Library Services
Salt Lake City

Associate Director of Technology (Salt Lake County Library Services, Utah)

Salt Lake County Library Services is currently accepting applications for
Associate Director of Technology. Our 18 libraries are near the majestic
Wasatch Mountains and 16 ski resorts with access to diverse recreational and
cultural opportunities. The Salt Lake Valley is a great place to live, work
and play! Job duties, salary information and minimum qualifications available
at the Salt Lake County Job Listing website. Online applications will be
accepted through May 31, 2015.



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/20995/
To post a new job please visit http://jobs.code4lib.org/


Re: [CODE4LIB] eebo

2015-06-05 Thread Eric Lease Morgan
On Jun 5, 2015, at 8:20 AM, Ethan Gruber wrote:

>> Does anybody here have experience reading the SGML/XML files representing
>> the content of EEBO?
> 
> Are these in TEI? Back when I worked for the University of Virginia
> Library, I did a lot of clean up work and migration of Chadwyck-Healey
> stuff into TEI-P4 compliant XML (thousands of files), but unfortunately all
> of the Perl scripts to migrate old garbage SGML into XML are probably gone.
> 
> How many of these things are really worth keeping, i.e., were not digitized
> by any other organization that has freely published them online?


The data I have comes in two flavors: 1) some flavor of SGML, and 2) some 
flavor of XML which is TEI-like, but not TEI. All of the files are worth 
keeping because I get the basic bibliographic information (id, author, title, 
date, keywords/subjects), as well as transcribed text. (No images.) Given such 
data, I think I can provide interesting, cool, and “kewl” services. Given the 
id number, I may then be able to link to the scanned image. Wish me luck. —ELM


Re: [CODE4LIB] eebo

2015-06-05 Thread Owen Stephens
Hi Eric,

I’ve worked with EEBO as part of the Jisc Historical Texts 
(https://historicaltexts.jisc.ac.uk/home) platform - which provides access to 
EEBO and other collections for UK Universities. My work was around the metadata 
and search of metadata and full text and display of results. I was mainly 
looking at metadata but did some digging into the TEI files to see how the 
markup could be used to extract metadata (e.g. presence of illustrations in the 
text).

I was lucky (?!) enough to have access to the MARC records, but I did also do 
some work looking at the metadata included in the TEI files.
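To give a flavour of that kind of digging, here is a minimal Python sketch that reads a title and author from the teiHeader and counts figure elements as a rough signal for illustrations. It is not the code used on the Jisc platform; it assumes TEI P5 files in the standard TEI namespace, and that illustrations are marked with the figure element, which should be checked against the actual files. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace.
TEI = {"tei": "http://www.tei-c.org/ns/1.0"}


def text_of(element):
    """Return the flattened text of an element, or None if it is missing."""
    if element is None:
        return None
    return "".join(element.itertext()).strip()


def summarize(tei_xml):
    """Extract basic metadata and an illustration count from a TEI string."""
    root = ET.fromstring(tei_xml)
    title = root.find(".//tei:teiHeader//tei:titleStmt/tei:title", TEI)
    author = root.find(".//tei:teiHeader//tei:titleStmt/tei:author", TEI)
    # Assumption: each illustration is encoded as a <figure> in the text.
    figures = root.findall(".//tei:text//tei:figure", TEI)
    return {
        "title": text_of(title),
        "author": text_of(author),
        "figures": len(figures),
    }


# Hypothetical sample document, invented for illustration.
SAMPLE = """\
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt>
    <title>A sample title</title><author>Anonymous</author>
  </titleStmt></fileDesc></teiHeader>
  <text><body><p>Text <figure/> more text <figure/></p></body></text>
</TEI>
"""

print(summarize(SAMPLE))
```

The same dictionary could feed a search index, with the figure count exposed as a facet.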

If there is anything I can help with I’d be happy to.

The people who worked with the files in detail were at a UK software 
development company, Knowledge Integration (http://www.k-int.com/). I can give 
you a contact there if that would be helpful.

Owen

Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: o...@ostephens.com
Telephone: 0121 288 6936

> On 5 Jun 2015, at 13:10, Eric Lease Morgan wrote:
> 
> Does anybody here have experience reading the SGML/XML files representing the 
> content of EEBO? 
> 
> I’ve gotten my hands on approximately 24 GB of SGML/XML files representing 
> the content of EEBO (Early English Books Online). This data does not include 
> page images. Instead it includes metadata of various ilks as well as the 
> transcribed full text. I desire to reverse engineer the SGML/XML in order to: 
> 1) provide an alternative search/browse interface to the collection, and 2) 
> support various types of text mining services. 
> 
> While I am making progress against the data, it would be nice to learn of 
> other people’s experience so I do not re-invent the wheel (too many 
> times). Got ideas?
> 
> —
> Eric Lease Morgan
> University Of Notre Dame


Re: [CODE4LIB] eebo

2015-06-05 Thread Ethan Gruber
Are these in TEI? Back when I worked for the University of Virginia
Library, I did a lot of clean up work and migration of Chadwyck-Healey
stuff into TEI-P4 compliant XML (thousands of files), but unfortunately all
of the Perl scripts to migrate old garbage SGML into XML are probably gone.


How many of these things are really worth keeping, i.e., were not digitized
by any other organization that has freely published them online?

On Fri, Jun 5, 2015 at 8:10 AM, Eric Lease Morgan wrote:

> Does anybody here have experience reading the SGML/XML files representing
> the content of EEBO?
>
> I’ve gotten my hands on approximately 24 GB of SGML/XML files representing
> the content of EEBO (Early English Books Online). This data does not
> include page images. Instead it includes metadata of various ilks as well
> as the transcribed full text. I desire to reverse engineer the SGML/XML in
> order to: 1) provide an alternative search/browse interface to the
> collection, and 2) support various types of text mining services.
>
> While I am making progress against the data, it would be nice to learn of
> other people’s experience so I do not re-invent the wheel (too many
> times). Got ideas?
>
> —
> Eric Lease Morgan
> University Of Notre Dame
>


[CODE4LIB] eebo

2015-06-05 Thread Eric Lease Morgan
Does anybody here have experience reading the SGML/XML files representing the 
content of EEBO? 

I’ve gotten my hands on approximately 24 GB of SGML/XML files representing the 
content of EEBO (Early English Books Online). This data does not include page 
images. Instead it includes metadata of various ilks as well as the transcribed 
full text. I desire to reverse engineer the SGML/XML in order to: 1) provide an 
alternative search/browse interface to the collection, and 2) support various 
types of text mining services. 

While I am making progress against the data, it would be nice to learn of other 
people’s experience so I do not re-invent the wheel (too many times). Got 
ideas?

—
Eric Lease Morgan
University Of Notre Dame