Re: [CODE4LIB] CODE4LIB Digest - 10 Jun 2013 to 11 Jun 2013 (#2013-147)

2013-06-12 Thread David Talley
Thanks, Debra, for encouraging participants to report out. The distributed 
conversations are tough to summarize (based on my limited experience) but if 
they include good links, people can try to follow along at a distance. The seed 
conversations sound like they'd be worth the trouble!

--

Date:Tue, 11 Jun 2013 09:10:49 -0500
From:Debra Shapiro dsshap...@wisc.edu
Subject: Re: LITA/ALCTS Library Linked Data IG managed discussion at ALA Annual 
in Chicago

Hi Karen, and others who might be interested; apologies to those who are not

The problem with streaming is that, after Jackie's short presentation - which 
could be captured, and I will try - it's going to be table discussions, and 
there might be 12 tables. So the noise level is going to be high, and we could 
only get fragments. We are going to ask table facilitators to post short 
messages to TodaysMeet (http://todaysmeet.com/) summarizing their table's 
talk. I will set up a room and share the link to the transcript of those text 
messages. Folks might tweet as well; I'll establish a hashtag at the start 
of the session.

thanks for your interest,
debra

On Jun 10, 2013, at 1:06 PM, Karen Coyle wrote:

 Debra - this looks very interesting, and makes me wish I were going to be 
 there. But I'm not. If anyone in the audience is able to stream this, even 
 without great AV quality, please send a message to the list. And for those of 
 you who are going, could you brainstorm about informal streaming?

 Thanks,
 kc

 On Mon Jun 10 11:00:42 2013, Debra Shapiro wrote:
 Linked Data IG managed discussion at ALA Annual in Chicago

 When:
 Sunday, June 30, 2013
 8:30 am to 10:00 am

 Where:
 McCormick Place Convention Center, Room N129

 What:
 The LITA/ALCTS Library Linked Data Interest Group invites you to attend a 
 managed discussion on Sunday, June 30, from 8:30-10:00 AM, at the McCormick 
 Place Convention Center, Room N129. Jackie Shieh of George Washington 
 University, one of the BIBFRAME Early Experimenters (EEs - 
 http://bibframe.org/faq/#q13), will give a short presentation designed to 
 kick off table discussions, on her institution's experience converting MARC 
 data to BIBFRAME. Please contact Theo Gerontakos (t...@uw.edu) or Debra 
 Shapiro (dsshap...@wisc.edu) if you'd like to volunteer as a table 
 facilitator.

 http://ala13.ala.org/node/11059

 Questions? Please send to Debra Shapiro (dsshap...@wisc.edu), not the list

 thanks

 dsshap...@wisc.edu
 Debra Shapiro
 UW-Madison SLIS
 Helen C. White Hall, Rm. 4282
 600 N. Park St.
 Madison WI 53706
 608 262 9195
 mobile 608 712 6368
 FAX 608 263 4849

 --
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 ph: 1-510-540-7596
 m: 1-510-435-8234
 skype: kcoylenet

dsshap...@wisc.edu
Debra Shapiro
UW-Madison SLIS
Helen C. White Hall, Rm. 4282
600 N. Park St.
Madison WI 53706
608 262 9195
mobile 608 712 6368
FAX 608 263 4849

--

Date:Tue, 11 Jun 2013 09:16:07 -0600
From:Sam Popowich sam.popow...@ualberta.ca
Subject: Code4Lib YEG Meetup this Thursday

Apologies for cross-posting.

This is just a reminder that the 2nd Edmonton Code4Lib Meetup will take
place this Thursday, June 13th at the Underground Tap and Grill, 10004
Jasper Ave, Edmonton.

We'll be building on some of the ideas we had last time to start planning
an event for late summer or early fall.

Thanks,
Sam.

--

Sam Popowich

Discovery Systems Librarian

University of Alberta Library

Edmonton, Alberta

sam.popow...@ualberta.ca

780-492-5753

--

Date:Tue, 11 Jun 2013 20:13:07 -
From:j...@code4lib.org
Subject: Job: Manager, IT Infrastructure and Client Services at Yale University

Library Information Technology

Yale University Library

New Haven, CT

Salary Grade: 25

Requisition: #21569BR

www.yale.edu

Schedule: Full-time (37.5 hours per week); Standard Work
Week (M-F, 8:30 - 5:00)

The University and the Library:

The Yale University Library, as one of the world's leading research libraries,
collects, organizes, preserves, and provides access to and services for a rich
and unique record of human thought and creativity. It
fosters intellectual growth and is a highly valued partner in the teaching and
research missions of Yale University and scholarly communities
worldwide. A distinctive strength is its rich spectrum of
resources, including more than 15 million volumes and information in all
media, ranging from ancient papyri to early printed books to electronic
databases. The Library is engaged in numerous digital initiatives designed to
provide access to a full array of scholarly information. Housed in 15
libraries, including Sterling Memorial, Beinecke, and Bass libraries, it
employs a dynamic, diverse, and innovative staff of over 500 who have the
opportunity to work with the highest caliber of faculty and students,
participate on committees, and who are involved in other areas of staff
development. For 

Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Owen Stephens
Putting the files on GitHub might be an option - free for public repositories, 
and 38 MB should not be a problem to host there

Owen

Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: o...@ostephens.com
Telephone: 0121 288 6936

On 12 Jun 2013, at 02:24, Dana Pearson dbpearsonm...@gmail.com wrote:

 I have crosswalked the Project Gutenberg RDF/DC metadata to MARC.  I would
 like to make these files available to any library that is interested.
 
 I thought that I would put them on my website via FTP but don't know if
 that is the best way.  Don't have an ftp client myself so was thinking that
 that may be now passé.
 
 I tried using Google Drive with access available via the link to two
 versions of the files, UTF8 and MARC8.  However, it seems that that is not
 a viable solution.  I can access the files with the URLs provided by
 setting the access to anyone with the URL but doesn't work for some of
 those testing it for me or with the links I have on my webpage..
 
 I have five folders with files of about 38 MB total.  I have separated the
 ebooks, audio books, juvenile content, miscellaneous and non-Latin scripts
 such as Chinese, Modern Greek.  Most of the content is in the ebook folder.
 
 I would like to make access as easy as possible.
 
 Google Drive seems to work for me.  Here's the link to my page with the
 links in case you would like to look at the folders.  Works for me but not
 for everyone who's tried it.
 
 http://dbpearsonmlis.com/ProjectGutenbergMarcRecords.html
 
 thanks,
 dana
 
 -- 
 Dana Pearson
 dbpearsonmlis.com


[CODE4LIB] Job: Information Technology Specialist (Web Archiving) at Library of Congress

2013-06-12 Thread jobs
The Library of Congress serves the Congress in fulfilling its duties and
preserves and promotes knowledge and creativity for the benefit of the
American people. It is the nation's oldest federal cultural institution and
the world's largest library, with more than 151 million items in its physical
collections (including books, manuscripts, prints, photos, film, video, and
sound recordings) and almost 20 million items online. Located primarily on
Capitol Hill in Washington, D.C., the Library is the home of the U.S.
Copyright Office, the Congressional Research Service (CRS), the Law Library of
Congress, and the National Library Service for the Blind and Physically
Handicapped.

  
The Information Technology Specialist (Data Specialist) provides technical
support to the Office of Strategic Initiatives in a wide variety of activities
related both to the transfer of digital data from sources outside and inside
the Library, and to the movement of data across multiple applications and data
stores within the Library. The data spans multiple content types including
text, graphic, photographic, moving image, cartographic, sound/audio, and
mixed media data, including website content. The position involves applying
technical expertise in digital data and digital data management, developing
and adhering to best practices for data transfer, as well as working
collaboratively with managers, technical staff, and subject matter experts
both inside and outside of the Library.

  
This position is located in the Office of Strategic Initiatives, Directorate
of the Associate Librarian for Strategic Initiatives.

  
The position description number for this position is 128812.

  
The salary range indicated reflects the locality pay adjustments for the
Washington, D.C., Metropolitan area.

  
The incumbent of this position will work a flextime work schedule.

  
This is a non-supervisory, bargaining unit position.

  
Relocation expenses will not be authorized for the person(s) selected under
this vacancy announcement.

  
KEY REQUIREMENTS

  
DUTIES:

  
Serves as a technical expert, trouble-shooter, or consultant in a team
transferring or moving digital content and the metadata related to the
content. Works with Library staff to plan, test and execute imports to, and
exports from, Library of Congress (LC) systems. Works with Library staff on
data management teams to define, develop, implement and monitor plans for
managing data sets through multiple phases of a digital life cycle in the
context of the LC environment. Works with Library staff on data management
teams and partner institutions to define and test efficient data movement
procedures, establish and conduct effective and robust operational processes,
and coordinate among stakeholders. Work is executed in an environment of
continual change, where digital content types and content sources are rapidly
expanding, and the supporting processes and technologies are in a state of
flux.

  
Analyzes formats, metadata, and packaging of data sets for a wide variety of
content types. Evaluates the characteristics of data sets in terms of the
requirements for LC content and metadata systems. Uses automated tools and
utilities to analyze, validate, edit, index, inventory, package, move and
document data sets.

  
Serves as liaison to appropriate staff inside the Library of Congress as well
as outside institutions and partners to ensure the proper assessment of
incomplete or conflicting information, and to resolve complex or difficult
matters that arise in connection with the movement and processing of digital
data. Maintains ongoing relationships with technical staff at LC and partner
institutions.

  
QUALIFICATIONS REQUIRED:

  
Applicants must have had progressively responsible experience and training
sufficient in scope and quality to furnish them with an acceptable level of
the following knowledge, skills, and abilities to perform the duties of the
position without more than normal supervision.

  * Knowledge of computer languages, utilities, and access methods.
  * Ability to research and analyze technology problems, issues, and program 
requirements.**
  * Ability to plan and execute work.
  * Ability to communicate in writing.
  * Ability to communicate effectively orally.
  * No additional requirements to those listed above.
  
HOW YOU WILL BE EVALUATED:

The Library of Congress evaluates applicants through an applicant
questionnaire and a structured interview. Applicants may also be screened for
some jobs through licensing, certification, and/or education requirements, a
narrative/application review, and/or a preliminary telephone interview. The
knowledge, skills, and abilities (KSAs) that are marked with a double asterisk
(**) in the vacancy announcement and the applicant questionnaire are
considered the most critical for a position. To be considered for final
selection, applicants must demonstrate fully acceptable experience in these
designated KSAs in the 

[CODE4LIB] Job: Data Services Librarian at Rutgers University

2013-06-12 Thread jobs
DESCRIPTION/RESPONSIBILITIES: The Rutgers University Libraries seek a
librarian skilled in teaching and instruction to fill the position of Data
Services Librarian in the John Cotton Dana Library on the Newark Campus of
Rutgers, The State University of New Jersey. Reporting to the Assistant
Chancellor and Director of John Cotton Dana Library and under the direction of
the Head of Public Services for the Dana Library, the Data Services Librarian
position includes significant teaching and instruction activity as well as
faculty liaison responsibilities, particularly with disciplinary and
interdisciplinary research centers, for the analysis of large data sets and
the provision of support in the presentation of the results of research and
analysis. In support of the instructional mission, the Data Services Librarian
instructs faculty and students in the use of research data sets as well as use
of software for quantitative and qualitative analysis. The Data Services
Librarian also participates as a member of the Public Services team providing
research assistance to the library's diverse faculty and student users. This
is a tenure-track appointment and as a member of a university-wide faculty,
the Data Services Librarian is expected to routinely participate in system-
wide initiatives, committees, and task forces, and to actively pursue and
participate in research, publication, and in professional associations.

  
QUALIFICATIONS: Required: ALA-accredited master's degree in Library and
Information Science as well as an advanced degree in a social sciences
discipline; experience working with quantitative data and manipulation of
datasets and experience teaching the use of datasets and statistical software
to faculty, students and researchers; knowledge of statistical software such
as SAS, SPSS (PASW), Stata, or R; familiarity with major data resources
(ICPSR, Census, etc.); awareness of national issues and trends in academic
librarianship, and the ability and desire to meet tenure and promotion
requirements; and a demonstrated commitment to fostering diversity is
required. Candidates who have had successful experience in the design and
delivery of services for diverse populations will be given preference.
Desired: Experience with software for qualitative data analysis; experience
with relational databases; experience in a library environment, including
reference and public services; and knowledge of XML and metadata standards
relevant to data. The successful candidate must be eligible to work in the
United States.

  
SALARY: Salary will be commensurate with qualifications and experience.

  
STATUS AND BENEFITS: Faculty status, calendar year appointment, retirement
plans, life and health insurance, prescription drug, dental and vision plans,
tuition remission, and 22 days annually.

  
LIBRARY PROFILE: The Rutgers University Libraries (RUL), comprising libraries
on the university's Camden, New Brunswick, and Newark campuses, all reporting
to the Vice President for Information Services and University Librarian,
operate as a unified library system with coordinated public, technical
services, and collection development programs including digital initiatives
and a pioneering institutional repository. The libraries have highly valued
staff of about 300 who are committed to develop innovations in access
services, information literacy and digital initiatives. RUL operates with a
budget of $28 million and outstanding collections especially in jazz and New
Jerseyana. Rutgers University Libraries are a member of ARL, CRL, Lyrasis,
Metro, NERL, and VALE, and use Sirsi Dynix and OCLC as primary bibliographic
utilities and Fedora repository software. In concert with the integration of
the University of Medicine and Dentistry of New Jersey with Rutgers
University, the libraries connected with those schools in Newark and New
Brunswick will become part of the Rutgers University Libraries system as of
July 1, 2013. Rutgers University is a member of the Association of American
Universities. Rutgers is also a member of the Committee on Institutional
Cooperation (CIC), the nation's premier higher education consortium of top
tier research institutions which includes Big Ten Conference members and the
University of Chicago.

  
The Newark Campus of Rutgers University is a doctoral-degree granting research
institution that is a leading education and research center. Classified as a
Carnegie Research Intensive institution, Rutgers-Newark offers 14 doctoral
programs: American studies, applied physics, biology, chemistry, criminal
justice, global affairs, integrative neuroscience, management, mathematical
sciences, nursing, psychology, public administration, and urban systems. For
more information go to the RUL Web site: www.libraries.rutgers.edu and to
learn about the Dana Library and Newark Campus go to:
library.newark.rutgers.edu. Rutgers is an ADVANCE Institution, committed to
increase diversity and the participation and advancement 

[CODE4LIB] Job: Metadata Analyst at Emory University

2013-06-12 Thread jobs
The Emory University Libraries seek an energetic, service-oriented and
collaborative professional to serve as the Metadata Analyst for the Content
Division in the Robert W. Woodruff Library. The ideal candidate will support
initiatives that relate to digital scholarship, digitization, special
collections access, and other metadata-dependent efforts to describe, manage,
expose and share collections with users.

  
Position Summary

Reporting to the Senior Director of the Content Division, the Metadata Analyst
supports initiatives that relate to digital scholarship, digitization, special
collections access, and other metadata-dependent efforts to describe, manage,
expose and share collections with users. Acting as an individual contributor,
the incumbent may alternately lead projects or serve as a member of a project
team and provide metadata expertise. The Metadata Analyst will interact with
curators, archivists, librarians, technologists, researchers and students to
learn about and deliver metadata solutions for projects and programs. The
Metadata Analyst focuses on creating and normalizing metadata, optimizing the
interoperability of metadata among systems, and leveraging metadata to
increase discoverability and use of collections. The Metadata Analyst also
monitors emerging technologies and recommends their adoption if they meet
project or long-term organizational goals.

  
Specific duties of the incumbent include:

  * Provides and anticipates metadata solutions for a wide variety of projects, 
services, and stakeholders, chiefly in special collections, digital 
scholarship, and IT units.
  * Identifies, designs, and develops schemas, ontologies, 
taxonomies, vocabularies, etc. for images, sound, video, text, realia, graphics, 
data, geospatial data, etc.
  * Prototypes and develops automated services and applications for 
metadata extraction, creation, normalization, analysis, transformation, 
syndication, and ingest.
  * Integrates semantic, linked data, and other metadata analytical 
technologies with various existing digital asset management and discovery 
platforms.
  * Contributes to research and development of other metadata projects 
and initiatives.
  * Develops training and documentation in support of metadata encoding and 
transformation for metadata librarians and catalogers.
  * Shares results of work with other staff through presentation and written 
documentation.
  * Facilitates meetings to learn about needs and to develop agreement and 
consensus.
  * Acts as chair of the University Libraries Metadata Working Group (MWG), 
providing leadership and direction in developing and implementing best practices 
for metadata creation and management across the Emory Libraries.
  * Schedules meetings and sets agendas. Builds consensus through dialog and 
group problem-solving, working with individuals and groups to reach agreement.
  * Provides updates to Library Cabinet, the senior management group.
  * Oversees and guides the work of the Cataloging and Authorities Working 
Group, a subgroup of the University Libraries Metadata Working Group, which 
includes the cataloging department heads from all Emory University libraries.
  * Participates in library committees related to primary job assignment as 
appropriate.
  * Represents the library on university committees and task forces related to 
primary job assignment OR at the request of the Senior Vice Provost for Library 
Services & Digital Scholarship.
  * Serves on professional and scholarly association committees, task forces, 
work groups, and other entities at the local, state, regional, national, and 
international level as appropriate to position and area of expertise.
  * Participates in appropriate professional and scholarly associations and 
organizations including maintaining membership and/or accreditation; attending 
meetings, conferences, workshops; and serving in appointed or elected positions.
  * Presents on work-related topics and research at professional and scholarly 
conferences, symposia, and workshops. Publishes on work-related topics and 
research in professional and scholarly publications.
  * Maintains up-to-date professional knowledge and skills in areas related to 
primary job assignment as well as maintaining general knowledge of current 
trends in higher education, academic libraries, and information and educational 
technology.

Required Qualifications

  * ALA-accredited master's degree in Library and Information Science OR 
equivalent education and experience (subject expertise combined with appropriate 
industry experience and/or library experience).
  * Knowledge of basic administration, management and automation of 
various Content Management Systems and installed software packages.
  * Technical expertise including: 
    * 2+ years related experience with metadata schemas, XML, and XSLT.
    * Knowledge of Semantic Web technologies (RDF, RDFS, OWL, SPARQL).
    * Familiarity with semantic web W3C standards and ongoing efforts.
    * Experience with 

Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Ross Singer
Or the Internet Archive, since there are also a whole bunch of other MARC dumps 
there.

-Ross.

On Jun 12, 2013, at 4:25 AM, Owen Stephens o...@ostephens.com wrote:

 Putting the files on GitHub might be an option - free for public 
 repositories, and 38Mb should not be a problem to host there
 
 Owen
 
 Owen Stephens
 Owen Stephens Consulting
 Web: http://www.ostephens.com
 Email: o...@ostephens.com
 Telephone: 0121 288 6936
 
 On 12 Jun 2013, at 02:24, Dana Pearson dbpearsonm...@gmail.com wrote:
 
 I have crosswalked the Project Gutenberg RDF/DC metadata to MARC.  I would
 like to make these files available to any library that is interested.
 
 I thought that I would put them on my website via FTP but don't know if
 that is the best way.  Don't have an ftp client myself so was thinking that
 that may be now passé.
 
 I tried using Google Drive with access available via the link to two
 versions of the files, UTF8 and MARC8.  However, it seems that that is not
 a viable solution.  I can access the files with the URLs provided by
 setting the access to anyone with the URL but doesn't work for some of
 those testing it for me or with the links I have on my webpage..
 
 I have five folders with files of about 38 MB total.  I have separated the
 ebooks, audio books, juvenile content, miscellaneous and non-Latin scripts
 such as Chinese, Modern Greek.  Most of the content is in the ebook folder.
 
 I would like to make access as easy as possible.
 
 Google Drive seems to work for me.  Here's the link to my page with the
 links in case you would like to look at the folders.  Works for me but not
 for everyone who's tried it.
 
 http://dbpearsonmlis.com/ProjectGutenbergMarcRecords.html
 
 thanks,
 dana
 
 -- 
 Dana Pearson
 dbpearsonmlis.com


Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Cary Gordon
I would put them on Dropbox or S3. The Dropbox free account is 2 GB.

Cary


On Wed, Jun 12, 2013 at 4:09 AM, Ross Singer rossfsin...@gmail.com wrote:

 Or the Internet Archive, since there are also a whole bunch of other MARC
 dumps there.

 -Ross.

 On Jun 12, 2013, at 4:25 AM, Owen Stephens o...@ostephens.com wrote:

  Putting the files on GitHub might be an option - free for public
 repositories, and 38Mb should not be a problem to host there
 
  Owen
 
  Owen Stephens
  Owen Stephens Consulting
  Web: http://www.ostephens.com
  Email: o...@ostephens.com
  Telephone: 0121 288 6936
 
  On 12 Jun 2013, at 02:24, Dana Pearson dbpearsonm...@gmail.com wrote:
 
  I have crosswalked the Project Gutenberg RDF/DC metadata to MARC.  I
 would
  like to make these files available to any library that is interested.
 
  I thought that I would put them on my website via FTP but don't know if
  that is the best way.  Don't have an ftp client myself so was thinking
 that
  that may be now passé.
 
  I tried using Google Drive with access available via the link to two
  versions of the files, UTF8 and MARC8.  However, it seems that that is
 not
  a viable solution.  I can access the files with the URLs provided by
  setting the access to anyone with the URL but doesn't work for some of
  those testing it for me or with the links I have on my webpage..
 
  I have five folders with files of about 38 MB total.  I have separated
 the
  ebooks, audio books, juvenile content, miscellaneous and non-Latin
 scripts
  such as Chinese, Modern Greek.  Most of the content is in the ebook
 folder.
 
  I would like to make access as easy as possible.
 
  Google Drive seems to work for me.  Here's the link to my page with the
  links in case you would like to look at the folders.  Works for me but
 not
  for everyone who's tried it.
 
  http://dbpearsonmlis.com/ProjectGutenbergMarcRecords.html
 
  thanks,
  dana
 
  --
  Dana Pearson
  dbpearsonmlis.com




-- 
Cary Gordon
The Cherry Hill Company
http://chillco.com


Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Dana Pearson
Thanks for the replies..I had looked at GitHub but thought it something
different, ie, collaborative software development...I will look again

hadn't thought of the Internet Archive but that might be good and I'll take
a look at Dropbox and Eric's other suggestions...altogether new to the
'cloud'

and regarding MARC records on the Project Gutenberg page...there is a new
feature that converts RDF/DC to MARC but the download was small so I
suspect it covers only recent additions...in fact, the necessary editing would
remain but it may be useful for keeping my work up to date...I'll be interested
to see how it handles line feeds in dc:title elements.
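
For anyone curious about the line-feed question, here is a minimal sketch of one way a crosswalk could flatten line feeds inside dc:title before building a MARC title field. It is not how the Project Gutenberg converter or the crosswalk described above actually works. The namespace URI, the sample record, and the function name normalized_titles are all assumptions made for illustration, and collapsing to a single space is only one possible choice (a cataloger might prefer to treat the break as a title/subtitle boundary).

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Assumption: titles live in the Dublin Core elements namespace.
# Real Project Gutenberg RDF may use a different namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"


def normalized_titles(rdf_xml: str) -> List[str]:
    """Return every dc:title with internal line feeds and runs of
    whitespace collapsed to a single space."""
    root = ET.fromstring(rdf_xml)
    titles = []
    for element in root.iter("{%s}title" % DC_NS):
        text = "".join(element.itertext())
        titles.append(re.sub(r"\s+", " ", text).strip())
    return titles


# Hypothetical sample record with a line feed inside the title.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description>
    <dc:title>Alice's Adventures
in Wonderland</dc:title>
  </rdf:Description>
</rdf:RDF>"""

print(normalized_titles(SAMPLE))
```

Running the same function over a whole RDF dump and comparing the result with the raw titles would show which records contain embedded line feeds.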

thanks again for the suggestions including Cary's that comes in as I type
this

dana




On Wed, Jun 12, 2013 at 6:09 AM, Ross Singer rossfsin...@gmail.com wrote:

 Or the Internet Archive, since there are also a whole bunch of other MARC
 dumps there.

 -Ross.

 On Jun 12, 2013, at 4:25 AM, Owen Stephens o...@ostephens.com wrote:

  Putting the files on GitHub might be an option - free for public
 repositories, and 38Mb should not be a problem to host there
 
  Owen
 
  Owen Stephens
  Owen Stephens Consulting
  Web: http://www.ostephens.com
  Email: o...@ostephens.com
  Telephone: 0121 288 6936
 
  On 12 Jun 2013, at 02:24, Dana Pearson dbpearsonm...@gmail.com wrote:
 
  I have crosswalked the Project Gutenberg RDF/DC metadata to MARC.  I
 would
  like to make these files available to any library that is interested.
 
  I thought that I would put them on my website via FTP but don't know if
  that is the best way.  Don't have an ftp client myself so was thinking
 that
  that may be now passé.
 
  I tried using Google Drive with access available via the link to two
  versions of the files, UTF8 and MARC8.  However, it seems that that is
 not
  a viable solution.  I can access the files with the URLs provided by
  setting the access to anyone with the URL but doesn't work for some of
  those testing it for me or with the links I have on my webpage..
 
  I have five folders with files of about 38 MB total.  I have separated
 the
  ebooks, audio books, juvenile content, miscellaneous and non-Latin
 scripts
  such as Chinese, Modern Greek.  Most of the content is in the ebook
 folder.
 
  I would like to make access as easy as possible.
 
  Google Drive seems to work for me.  Here's the link to my page with the
  links in case you would like to look at the folders.  Works for me but
 not
  for everyone who's tried it.
 
  http://dbpearsonmlis.com/ProjectGutenbergMarcRecords.html
 
  thanks,
  dana
 
  --
  Dana Pearson
  dbpearsonmlis.com




-- 
Dana Pearson
dbpearsonmlis.com


Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Owen Stephens
On 12 Jun 2013, at 14:06, Dana Pearson dbpearsonm...@gmail.com wrote:

 Thanks for the replies..I had looked at GitHub but thought it something
 different, ie, collaborative software development...I will look again

Yes - that's the main use (git is version control software, GitHub hosts git 
repositories) - but of course git doesn't care what types of files you have 
under version control. It came to mind because I know it's been used to 
distribute metadata files before - e.g. this set of metadata from the Cooper 
Hewitt National Design Museum https://github.com/cooperhewitt/collection

There could be some additional benefits gained through using git to version 
control this type of file, and GitHub to distribute them if you were 
interested, but it can act as simply a place to put the files and make them 
available for download. But of course the other suggestions would do this 
simpler task just as well.
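
As a rough illustration of why git is indifferent to file type, the sketch below computes the object ID git assigns to a file's contents in a default (SHA-1) repository: a hash over a short header plus the raw bytes, with no regard for whether those bytes are source code or MARC. The function name git_blob_id and the sample bytes are made up for the example, and repositories created with the newer SHA-256 object format would give different IDs.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 object ID git uses for a blob with this content.

    git hashes the header "blob <size>" plus a NUL byte, followed by the
    raw bytes, so the file's type or encoding plays no part.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# Hypothetical stand-in for binary MARC content: a leader-like string
# followed by the MARC record terminator byte (0x1D).
fake_marc_bytes = b"00000nam a2200000 a 4500" + b"\x1d"

print(git_blob_id(fake_marc_bytes))
print(git_blob_id(b""))
```

The value printed for a real file should match what `git hash-object <file>` reports for it in a SHA-1 repository.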

Owen


[CODE4LIB] Registration open for iPRES 2013 / DC-2013

2013-06-12 Thread Angela Dappert
Registration open for iPRES 2013 / DC-2013



*Apologies for cross-posting*



=== iPRES 2013 / DC-2013 INVITATION TO REGISTER ===



Registration for iPRES-2013 is now open.

The conference will take place 2-6 September 2013 in Lisbon, Portugal at
the Instituto Superior Técnico.



Conference website: http://ipres2013.ist.utl.pt/index.html

Draft program: http://ipres2013.ist.utl.pt/prg_overview.html

The scientific program comprises 37 full and short papers, as well as 25
posters and demonstrations. There will be 8 tutorials, 12 workshops and a
doctoral symposium.



*Topics of interest include*:

• Innovation in Digital Preservation: Novel Challenges and
Scenarios; Innovative Approaches; Preservation at Scale; Domain-specific
Challenges (Cultural Heritage, Technical and Scientific Processes and Data,
Engineering Models and Simulation, Medical Records, Corporate Processes and
Recordkeeping, Web Archiving, Personal Archiving, e-Procurement, etc.)

• Systems Life-cycle: Specific Digital Preservation Requirements
and Implications in Modeling, Design, Development, Deployment and
Maintenance

• Governance: Risk Analysis; Audit, Trust and Certification,
Trusted Repositories; Information/Data Quality

• Business Models and Added-value of Digital Preservation:
Benefits Analysis, Emerging Exploitation Scenarios, Long-Tail of Digital
Preservation

• Theory of Digital Preservation: Interdisciplinary Modeling,
Representation Concepts, Incentive Structures

• Case Studies and Best Practices: Processes, Metadata, Systems,
Services, Infrastructures

• Training and Education

iPRES-2013 will be collocated with DC-2013.  Both conferences will take
place in the same venue and run in parallel. During the collocated events,
delegates are welcome to choose sessions that best fit their interests from
either conference. Keynotes are held in common plenaries; and, social
events are shared, providing an excellent opportunity for iPRES and DCMI
delegates to socialize, share common interests and network. Delegates of
the two conferences may separately register for a mix of pre- and
post-conference events organized by the conference committees of both iPRES
and DCMI.



*Important Dates*:

• 08 July 2013: Deadline for early registration

• 02 September 2013: Tutorial sessions and Doctoral Symposium

• 03 September 2013: Conference starting…

• 05 September 2013: Conference closing (noon time)

• 05 September 2013: Workshops starting (afternoon)



€350 early regular, €250 early student (to 8 July)

€375 regular  student (after 8 July)

--Separate rates apply for pre-/post-conference sessions on Monday and
Friday

--Day rates are available



Registration questions? Contact for DC/iPRES 2013:  ipres2...@ist.utl.pt



We look forward to seeing you in Lisbon in September.



*Tutorials*:

T1.1 - Introduction to Linked Open Data (LOD)

T1.2 - Metadata Provenance

T1.3 - IGM: Maturidade da Governação da Informação
(Tutorial in Portuguese)

T1.4 - Build it, Share it, Keep it safe

T1.5 - Personal Digital Archiving

T1.6 - Islandora Institutional Repository Tutorial

T2.1 - Introduction to Ontology Concepts and Terminology

T2.2 - PROV - the W3C Provenance Ontology

T2.3 - IGM: Information Governance Maturity (Tutorial in
English)

T2.4 - Legal challenges in the preservation lifecycle – How
to address and how to solve them!

T2.5 - From Preserving Data to Preserving Research –
Curation of Process and Context

T2.6 - Tools for uncovering preservation risks in your
large repositories

T3.1 - Datasets, Open Data and Digital Preservation

T3.2 - Getting Started in Web Archiving and Web Archives
Preservation

*Workshops*

W1 - Digital Preservation Capabilities - How to assess and
improve capabilities in digital preservation?

W2 - PREMIS Implementation Fair Workshop

W3 - Archiving Community Memories

W4 - Cost of Curation

W5 - Preservation at Scale

W6 - Interoperability of Persistent Identifiers Systems –
services across PI domains

W7 - Open Research Challenges in Digital Preservation

W8 - CAMP-4-DATA

W9 - Vocabulary Day

 *iPREShack*: SPRUCE, CURATEcamp and OPF Hackathon (from September 2nd to
5th)



Please forward this email to anybody who might be interested!


Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Ford, Kevin
Hi Dana,

Out of curiosity, how does your crosswalk differ from Project Gutenberg's MARC 
files?  See, e.g.:

http://www.gutenberg.org/wiki/Gutenberg:Offline_Catalogs#MARC_Records_.28automatically_generated.29

Yours,
Kevin

--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress 
Washington, DC



 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Dana Pearson
 Sent: Tuesday, June 11, 2013 9:24 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] best way to make MARC files available to anyone
 
 I have crosswalked the Project Gutenberg RDF/DC metadata to MARC.  I
 would like to make these files available to any library that is
 interested.
 
 I thought that I would put them on my website via FTP but don't know if
 that is the best way.  Don't have an ftp client myself so was thinking
 that that may be now passé.
 
 I tried using Google Drive with access available via the link to two
 versions of the files, UTF8 and MARC8.  However, it seems that that is
 not a viable solution.  I can access the files with the URLs provided
 by setting the access to anyone with the URL but doesn't work for some
 of those testing it for me or with the links I have on my webpage..
 
 I have five folders with files of about 38 MB total.  I have separated
 the ebooks, audio books, juvenile content, miscellaneous and non-Latin
 scripts such as Chinese, Modern Greek.  Most of the content is in the
 ebook folder.
 
 I would like to make access as easy as possible.
 
 Google Drive seems to work for me.  Here's the link to my page with the
 links in case you would like to look at the folders.  Works for me but
 not for everyone who's tried it.
 
 http://dbpearsonmlis.com/ProjectGutenbergMarcRecords.html
 
 thanks,
 dana
 
 --
 Dana Pearson
 dbpearsonmlis.com


Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Ford, Kevin
Doh!

I read all the emails in the thread except for Eric's, which asked the same 
question.

Either way, his or mine, nevertheless curious.

Kevin

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Eric Phetteplace
 Sent: Tuesday, June 11, 2013 10:57 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] best way to make MARC files available to anyone
 
 Dana - perhaps a public Dropbox folder? Or just put the files up on
 your site somewhere, served with a Content-Disposition: attachment
 header so they trigger a download when accessed? E.g. here's a
 StackOverflow thread
 http://stackoverflow.com/questions/9195304/how-to-use-content-disposition-for-force-a-file-to-download-to-the-hard-drive
 on that. If they must be a recognized MIME type, you could compress
 them as .zip or .tar.gz files on the server, which would reduce
 download time either way.
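The Content-Disposition idea above could be sketched roughly as follows, using only the Python standard library. This is a minimal illustration, not a description of how any site in this thread is actually served: the handler name `DownloadHandler`, the helper `disposition_for`, the `serve` function and port 8000 are all assumptions made for the example.

```python
import os
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import unquote, urlsplit


def disposition_for(path: str) -> str:
    """Build a Content-Disposition value asking the browser to save the file."""
    # Use the last path segment as the suggested file name.
    name = os.path.basename(unquote(urlsplit(path).path)) or "download"
    safe = name.replace('"', "")  # keep the quoted-string well formed
    return f'attachment; filename="{safe}"'


class DownloadHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory, always as attachments."""

    def end_headers(self):
        # Add the header just before the header block is closed.
        self.send_header("Content-Disposition", disposition_for(self.path))
        super().end_headers()


def serve(port: int = 8000) -> None:
    """Hypothetical entry point: run from the folder holding the MARC files."""
    ThreadingHTTPServer(("", port), DownloadHandler).serve_forever()
```

Calling `serve()` from the directory that holds the record files would make each link trigger a save dialog instead of the browser trying to display the file. On a hosted site the same header is usually set in the web server configuration rather than in code.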
 
 I did try clicking the links on your site and they never downloaded,
 the request just timed out.
 
 Not to discredit what you're doing, which is great, but aren't MARC
 records already available for Project Gutenberg? See their offline
 catalogs page:
 http://www.gutenberg.org/wiki/Gutenberg:Offline_Catalogs#MARC_Records_.28offsite.29
 
 Best,
 Eric Phetteplace
 Emerging Technologies Librarian
 Chesapeake College
 Wye Mills, MD
 
 


[CODE4LIB] From Preserving Data to Preserving Research. Registration open for tutorial at TPDL 2013 22 Sep 2013

2013-06-12 Thread Angela Dappert
=== Apologies for cross-posting ===

Tutorial to be held in connection with TPDL 2013, 22 September 2013,
Valletta, Malta http://www.tpdl2013.info/

From Preserving Data to Preserving Research: Curation of Process and
Context

 *ABSTRACT*

In the domain of eScience, investigations are increasingly collaborative.
Most scientific and engineering domains benefit from building on the
outputs of other research: by sharing information to reason over and data
to incorporate in the modeling task at hand. This raises the need for
preserving and sharing entire eScience workflows and processes for later
reuse. We need to define which information is to be collected, create means
to preserve it and approaches to enable and validate the re-execution of a
preserved process. This includes and goes beyond preserving the data used
in the experiments, as the process underlying its creation and use is
essential.

The TIMBUS project and Wf4Ever project team up for this half-day tutorial
to provide an introduction to the problem domain and discuss solutions for
the curation of eScience processes.

1. TUTORIAL LEVEL: Introductory level

2. DURATION: Half-day

3. OUTLINE OF THE CONTENT

The tutorial will cover the following topics:

*Introduction to Process and Context Preservation*: The introduction will
motivate the need for process and context preservation, illustrate how this
task is difficult in an evolving domain, and introduce a use case for the
rest of the tutorial to illustrate approaches and tools.

*Data Citation*: Data forms the basis of the results of many research
publications, and thus needs to be referenced with the same accuracy as
bibliographic data. Only if data can be identified with high precision can
it be reused, validated, verified and reproduced. Citing a specific data
set is however not trivial - it exists in a vast plurality of
specifications and instances, can potentially be huge in size, and its
location might change. We will provide an overview over existing approaches
to overcoming these challenges. Further, we will present the issue of
creating data citations of data held in databases, especially of dynamic
data sets where data is added or updated on a regular basis.

*Re-usability and traceability of workflows and processes*: The processes
creating and interpreting data are complex objects. Curating and preserving
them requires special effort, as they are dynamic, and highly dependent on
software, configuration, hardware, and other aspects. We will discuss these
issues in detail, and provide an introduction to two complementary
approaches.

The first approach is based on the concept of Research Objects, which
adopts a workflow-centric approach and thereby aims at facilitating the
reuse and reproducibility. It allows packaging the data and the methods as
one Research Object to share and cite it, and thus enable publishers to
grant access to the actual data and methods that contribute to the findings
reported in scholarly articles.

A second approach focuses on describing and preserving a process and the
context it is embedded in. The artifacts that may need to be captured range
from data, software and accompanying documentation, to legal and human
resource aspects. Some of this information can be automatically extracted
from an existing process, and tools for this will be presented. Ways to
archive the process and to perform preservation actions on the process
environment, such as recreating a controlled execution environment or
migration of software components, are presented. Finally, the challenge of
evaluating the re-execution of a preserved process is discussed, addressing
means of establishing its authenticity.

4. INTENDED AUDIENCE

The tutorial is targeted at researchers, publishers and curators in
eScience disciplines who want to learn about methods of ensuring the
long-term availability of experiments forming the basis of scientific
research.

5. EXPECTED LEARNING OUTCOMES

The tutorial participants will come to understand:

·   Motivations and challenges of process preservation

·   Motivations, stakeholders and challenges of making data citable

·   How Data is Cited Today: OECD [1] report on data citability, Google
search of data sets, requirements, guidelines, metadata, locators and
identifiers, approaches to naming schemes and properties.

·   Available technologies for identifiers: Archival Resource Key (ARK),
Digital Object Identifiers (DOI), Extensible Resource Identifier (XRI),
HANDLE, Life Science ID (LSID), Object Identifiers (OID), Persistent
Uniform Resource Locators (PURL), URI/URN/URL, Universally Unique Identifier
(UUID)

·   Approaches and Initiatives for citing data: CODATA, DataCite,
OpenAIRE; challenges and opportunities: granularity, scalability,
complexity and evolving data sets; current research questions

·   Ontologies needed to capture research objects: Core Ontology of the
RO family of vocabularies, workflow 

Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Daniel Lovins
If anyone from HathiTrust is watching this thread, I'd also be curious if
they're considering bulk record downloads via something other than OAI [1].

Thanks.

Daniel
[1] http://www.lib.umich.edu/michigan-digitization-project-oai-harvesting




[CODE4LIB] Job: Web Developer at East Carolina University

2013-06-12 Thread jobs
We're looking for a talented, enthusiastic team player to join our ranks and
help us to elevate the web presence of the ECU Libraries.
Please share with anyone that you think might be interested
in joining us! 

___

  
The Application & Discovery Services (ADS) unit supports the web and software
needs of the ECU (East Carolina University) Libraries. In addition to
supporting the libraries' websites, ADS works collaboratively with all
departments to support project requests including but not limited to custom
application development, maintenance of data repositories, software
installation and configuration, and overall technical support.

  
This employee supports the planning, development, design, testing, and
maintenance of a wide range of web and software applications used by the
libraries. The person in this position works collaboratively with other team
members to support existing applications, in addition to operating
independently on new project development. This person will use PHP,
JavaScript, .NET, CommonSpot, XML/XSLT, CSS, AJAX, and other related
technologies as needed to maintain and create web applications for both
internal and external audiences. It is the responsibility of this individual
to provide customization and support for open source applications, implement
workflows to extract, transform, and repurpose data, and provide integration
with vendor-based APIs and web services components. In consultation with the
lead developer, this individual determines project needs, prepares working
mockups, installs and configures software applications, troubleshoots issues,
and manages databases such as SQL and MySQL.

  
Associate's degree in computer science, information technology, or related
discipline and one year of experience in the information technology field
related to the area of assignment; or Bachelor's degree and one year of
experience in the information technology field related to the area of
assignment; or Bachelor's degree in computer science, information technology,
or related discipline; or equivalent combination of training and experience.
All degrees must be received from appropriately accredited institutions.

  
Preferred: Bachelor's degree and three to five years' experience.

  
Review of applications will start on 6/26/2013.

  
  



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/8322/


Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Dana Pearson
Kevin,

don't know yet since don't know how to unzip the file...bz2?...in any case,
I'm guessing that there is no post transformation editing that most
libraries would insist upon...eg, subject headings in the metadata are
strings with hyphens separating subjects from subheadings and spatial,
temporal, genre subfields have to be introduced...some content needs to go
into 600, 610, 611, 630, 651 fields...for more on the post transform editing
see:

http://dbpearsonmlis.com/GPmetadata.html
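
To make the subject-heading step concrete, here is a rough sketch of splitting a hyphen-delimited subject string into MARC subfields, written in MarcEdit's mnemonic `$` notation. It is not the stylesheet or editing procedure described in this message. The function name `subject_subfields`, the `FORM_TERMS` list and the date heuristic are assumptions for illustration only; choosing the right tag (600, 610, 611, 630, 650, 651), geographic subdivisions ($z) and personal-name headings would still need separate handling or manual review.

```python
import re

# Hypothetical, deliberately short list of form subdivisions mapped to $v.
FORM_TERMS = {"Fiction", "Poetry", "Drama", "Juvenile literature"}

# Any 3- or 4-digit run is treated as a sign of a chronological subdivision.
_HAS_YEAR = re.compile(r"\d{3,4}")


def subject_subfields(heading: str) -> str:
    """Turn 'A -- B -- C' into '$aA$xB$vC.' using naive heuristics."""
    head, *rest = [part.strip() for part in heading.split(" -- ")]
    out = [f"$a{head}"]
    for part in rest:
        if part in FORM_TERMS:
            code = "v"  # form subdivision
        elif _HAS_YEAR.search(part):
            code = "y"  # chronological subdivision
        else:
            code = "x"  # general subdivision
        out.append(f"${code}{part}")
    return "".join(out) + "."


print(subject_subfields("France -- History -- Regency, 1715-1723 -- Fiction"))
```

Anything the heuristics get wrong still has to be caught by hand, which is why this kind of editing takes far longer than the transform itself.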

dana


-- 
Dana Pearson
dbpearsonmlis.com


Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Dana Pearson
Kevin, Eric

7zip worked fine to unzip and records look pretty good since they used 653
and preserved the string from the metadata element with the hyphens.
However the records do not do subfield d in 100 or 700 fields and thus
such content appears in the 245$c.  245$a seems to go missing with some
frequency.  MarcEdit does not report any errors though.
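
As an illustration of the subfield d point, a name heading with trailing life dates can be split into $a and $d with a small regular expression. This is only a sketch under assumptions: the function name `name_with_dates` and the date pattern are invented for the example, and it covers simple cases such as "1802-1870" or an open range like "1950-", not titles, fuller forms of name, or approximate dates.

```python
import re

# A name, a comma, then life dates at the very end of the heading.
_DATES = re.compile(r"^(?P<name>.+?),\s*(?P<dates>\d{4}\??-(?:\d{4}\??)?)$")


def name_with_dates(heading: str) -> str:
    """Return the heading in mnemonic subfield form, e.g. '$aName,$dDates.'"""
    text = heading.strip()
    match = _DATES.match(text)
    if match is None:
        return f"$a{text}"  # no recognizable dates: leave as a single $a
    dates = match.group("dates")
    # An open range already ends in a hyphen, so no closing period is added.
    end = "" if dates.endswith("-") else "."
    return f"$a{match.group('name')},$d{dates}{end}"


print(name_with_dates("Dumas, Alexandre, 1802-1870"))
```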

My original intent was just to keep my XSLT skills sharp while I had some
free time last August.  After creating the stylesheet, I then had no free
time until January when I could devote 2 or 3 hours to the post transform
editing.  Thought I'd just dive in but the pool was much deeper than I had
anticipated.

Do think libraries will prefer my edited versions although different in
non-access points as well.  Incidentally, not many additions since my
harvest.

First record in the Project Gutenberg produced records:

=LDR  00721cam a22002293a 4500
=001  27384
=003  PGUSA
=008  081202s2008xxu|s|000\|\eng\d
=040  \\$aPGUSA$beng
=042  \\$adc
=050  \4$aPQ
=100  1\$aDumas, Alexandre, 1802-1870
=245  10$a$h[electronic resource] /$cby Alexandre, 1802-1870 Dumas
=260  \\$bProject Gutenberg,$c2008
=500  \\$aProject Gutenberg
=506  \\$aFreely available.
=516  \\$aElectronic text
=653  \0$aFrance -- History -- Regency, 1715-1723 -- Fiction
=653  \0$aOrléans, Philippe, duc d', 1674-1723 -- Fiction
=830  \0$aProject Gutenberg$v27384
=856  40$uhttp://www.gutenberg.org/etext/27384
=856  42$uhttp://www.gutenberg.org/license$3Rights

couldn't readily find the above item but here's an example of my records by
the same author.

=LDR  01002nam a22002535  4500
=001  PG18997
=006  md
=007  cr||n\|||muaua
=008  \\s2006utu|o|||eng\d
=042  \\$adc
=090  \\$aPQ
=092  \0$aeBooks
=100  1\$aDumas, Alexandre,$d1802-1870.
=245  14$aThe Vicomte de Bragelonne$h[electronic resource] :$bOr Ten Years
Later being the completion of The Three Musketeers And Twenty Years
After /$cAlexandre Dumas.
=260  \\$aSalt Lake City :$bProject Gutenberg Literary Archive
Foundation,$c2006.
=300  \\$a1 online resource :$bmultiple file formats.
=500  \\$aRecords generated from Project Gutenberg RDF data.
=540  \\$aApplicable license:$uhttp://www.gutenberg.org/license
=650  \0$aAdventure stories.
=650  \0$aHistorical fiction.
=651  \0$aFrance$vHistory$yLouis XIV, 1643-1715$vFiction.
=655  \0$aElectronic books.
=710  2\$aProject Gutenberg.
=856  40$uhttp://www.gutenberg.org/etext/18997$zClick to access.
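
The Project Gutenberg-produced record quoted above has an empty 245$a, which a validator may accept because the field is structurally valid. A check along the following lines could flag such records in MarcEdit mnemonic (.mrk) text. It is a sketch only: the function names are made up, it assumes each field sits on one line (so the wrapped lines in this email would need rejoining first), and it relies on the fixed `=TAG  II` prefix of eight characters.

```python
def parse_subfields(line: str) -> list[tuple[str, str]]:
    """Split '=245  10$aTitle$h...' into [('a', 'Title'), ('h', '...')]."""
    data = line[8:]  # skip '=', the 3-character tag, 2 spaces, 2 indicators
    return [(piece[0], piece[1:]) for piece in data.split("$")[1:] if piece]


def title_missing(record: str) -> bool:
    """True when the record has no 245 field or its $a is absent or empty."""
    for line in record.splitlines():
        line = line.strip()
        if line.startswith("=245"):
            for code, value in parse_subfields(line):
                if code == "a":
                    return not value.strip()
            return True  # a 245 with no $a at all
    return True  # no 245 field found


sample = "=245  10$a$h[electronic resource] /$cby Alexandre, 1802-1870 Dumas"
print(title_missing(sample))
```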

thanks for your interest..

regards,
dana


-- 
Dana Pearson
dbpearsonmlis.com


Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Dana Pearson
quick followup on the thread..

github:  I looked at the cooperhewitt collection but don't see a way to
download the content...I could copy and paste their content but that may
not be the best approach for my files...documentation is thin, seems i
would have to provide email addresses for those seeking access...but
clearly that is not the case with how the cooperhewitt archive is
configured..

My primary concern has been to make it as simple a process as possible for
libraries which have limited technical expertise.  One of the reasons I
made a career change was my inability as a library director to integrate
very useful online resources in the library's content discovery system.
 Each of the libraries I led lacked expertise and/or the technical support
necessary to do so.  So, quit my job, re-tooled and now working
independently.

Internet Archive:  I did a search that included a query term MARC and found
the Open Library and this may be the best option but I will have to include
a field in each record I think...something I could easily do...the marc
records do download nicely...I'll send a message for guidance on this

Eric's suggestion regarding MIME type is interesting as well but seems I
would have to have a recognizable type like zip...would prefer to have the
files no larger than 4000 or so records to facilitate processing...there
are also some content libraries may not want...eg, erotic literature,
juvenile content..
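
Keeping files to 4000 or so records can be done without a MARC library, because each binary MARC (ISO 2709) record begins with a five-digit record length in its leader. The sketch below splits a byte string of concatenated records into batches on that basis. The function names and the default of 4000 are assumptions for the example, and it does no validation beyond reading the length.

```python
def iter_records(data: bytes):
    """Yield each raw record, using the 5-digit length at the leader's start."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 5].decode("ascii"))
        if length <= 0:
            raise ValueError(f"bad record length at byte {pos}")
        yield data[pos:pos + length]
        pos += length


def split_batches(data: bytes, max_records: int = 4000) -> list[bytes]:
    """Group records into byte strings of at most max_records records each."""
    batches, current = [], []
    for record in iter_records(data):
        current.append(record)
        if len(current) == max_records:
            batches.append(b"".join(current))
            current = []
    if current:  # final partial batch
        batches.append(b"".join(current))
    return batches
```

Each returned batch can be written out as its own .mrc file and, if wanted, zipped. Separate folders by content type would still let a library skip material it does not want.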

found the file for comparison with GP generated MARC:

=LDR  00945nam a22002535  4500
=001  PG27384
=006  md
=007  cr||n\|||muaua
=008  \\s2008utu|o|||eng\d
=042  \\$adc
=090  \\$aPQ
=092  \0$aeBooks
=100  1\$aDumas, Alexandre,$d1802-1870.
=240  14$aUne fille du régent.$lEnglish
=245  14$aThe Regent's Daughter$h[electronic resource] /$cAlexandre Dumas.
=260  \\$aSalt Lake City :$bProject Gutenberg Literary Archive
Foundation,$c2008.
=300  \\$a1 online resource :$bmultiple file formats.
=500  \\$aRecords generated from Project Gutenberg RDF data.
=540  \\$aApplicable license:$uhttp://www.gutenberg.org/license
=600  10$aOrléans, Philippe,$cduc d',$d1674-1723$vFiction.
=651  \0$aFrance$xHistory$yRegency, 1715-1723$vFiction.
=655  \0$aElectronic books.
=710  2\$aProject Gutenberg.
=856  40$uhttp://www.gutenberg.org/etext/27384$zClick to access.

Gutenberg Project MARC:

=LDR  00721cam a22002293a 4500
=001  27384
=003  PGUSA
=008  081202s2008xxu|s|000\|\eng\d
=040  \\$aPGUSA$beng
=042  \\$adc
=050  \4$aPQ
=100  1\$aDumas, Alexandre, 1802-1870
=245  10$a$h[electronic resource] /$cby Alexandre, 1802-1870 Dumas
=260  \\$bProject Gutenberg,$c2008
=500  \\$aProject Gutenberg
=506  \\$aFreely available.
=516  \\$aElectronic text
=653  \0$aFrance -- History -- Regency, 1715-1723 -- Fiction
=653  \0$aOrléans, Philippe, duc d', 1674-1723 -- Fiction
=830  \0$aProject Gutenberg$v27384
=856  40$uhttp://www.gutenberg.org/etext/27384
=856  42$uhttp://www.gutenberg.org/license$3Rights

thanks again,
dana


On Wed, Jun 12, 2013 at 6:19 PM, Dana Pearson dbpearsonm...@gmail.com wrote:

 Kevin, Eric

 7zip worked fine to unzip and records look pretty good since they used 653
 and preserved the string from the metadata element with the hypens.
  However the records do not do subfield d in 100 or 700 fields and thus
 such content appears in the 245$c.  245$a seems to go missing with some
 frequency.  MarcEdit does not report any errors though.

 My original intent was just to keep my XSLT skills sharp while I had some
 free time last August.  After creating the stylesheet, I then had no free
 time until January when I could devote 2 or 3 hours to the post transform
 editing.  Thought I'd just dive in but the pool was much deeper than I had
 anticipated.

 Do think libraries will prefer my edited versions although different in
 non-access points as well.  Incidentally, not many additions since my
 harvest.

 First record in the Project Gutenberg produced records:

 =LDR  00721cam a22002293a 4500
 =001  27384
 =003  PGUSA
 =008  081202s2008xxu|s|000\|\eng\d
 =040  \\$aPGUSA$beng
 =042  \\$adc
 =050  \4$aPQ
 =100  1\$aDumas, Alexandre, 1802-1870
 =245  10$a$h[electronic resource] /$cby Alexandre, 1802-1870 Dumas
 =260  \\$bProject Gutenberg,$c2008
 =500  \\$aProject Gutenberg
 =506  \\$aFreely available.
 =516  \\$aElectronic text
 =653  \0$aFrance -- History -- Regency, 1715-1723 -- Fiction
 =653  \0$aOrléans, Philippe, duc d', 1674-1723 -- Fiction
 =830  \0$aProject Gutenberg$v27384
 =856  40$uhttp://www.gutenberg.org/etext/27384
 =856  42$uhttp://www.gutenberg.org/license$3Rights

 couldn't readily find the above item but here's an example of my records
 by the same author.

 =LDR  01002nam a22002535  4500
 =001  PG18997
 =006  md
 =007  cr||n\|||muaua
 =008  \\s2006utu|o|||eng\d
 =042  \\$adc
 =090  \\$aPQ
 =092  \0$aeBooks
 =100  1\$aDumas, Alexandre,$d1802-1870.
 =245  14$aThe Vicomte de Bragelonne$h[electronic