Re: [CODE4LIB] pdf2txt [tesseract]

2013-10-17 Thread Eric Lease Morgan
On Oct 16, 2013, at 10:56 AM, Robert Haschart rh...@virginia.edu wrote: The abstract extraction routine I have been working on does use tesseract internally for doing OCR when it encounters a document that doesn't have usable full-text. I agree that tesseract is not that easy to install,

[CODE4LIB] Online validator for RelaxNG or Schematron?

2013-10-17 Thread Wolfe, Mark D
Does anyone know of an online validator for either Relax NG or Schematron? Thanks, Mark. Mark Wolfe, Curator of Digital Collections, M. E. Grenander Department of Special Collections & Archives, Science Library 355, University at Albany, SUNY, 1400 Washington Avenue, Albany NY 1 Phone: (518)

Re: [CODE4LIB] pdf2txt [tesseract]

2013-10-17 Thread Christian Pietsch
Hi Eric, On Thu, Oct 17, 2013 at 09:43:04AM -0400, Eric Lease Morgan wrote: Robert, can you outline the process you used to get Tesseract to do OCR against PDF documents? I installed Tesseract a few months ago, but I couldn't figure out how to get it to work against PDF, only some image files.

Re: [CODE4LIB] Google Analytics on multiple systems

2013-10-17 Thread Josh Wilson
Hi Joel, It usually ends up being easiest to go with one GA account, separating different sources by using different properties (e.g., UA-[acct number]-1 for CONTENTdm, UA-[acct number]-2 for LibGuides, etc.) rather than separate accounts entirely. Each property can have different users with

Re: [CODE4LIB] Google Analytics on multiple systems

2013-10-17 Thread Joel Marchesoni
Thank you all for your replies. I'm thinking we'll go with one account (we already have a Google account for various other services) with multiple properties. One thing that has complicated matters is that the property we currently use cannot yet be upgraded to Universal Analytics, which is

Re: [CODE4LIB] pdf2txt [tesseract]

2013-10-17 Thread Robert Haschart
On 10/17/2013 9:43 AM, Eric Lease Morgan wrote: On Oct 16, 2013, at 10:56 AM, Robert Haschart rh...@virginia.edu wrote: The abstract extraction routine I have been working on does use tesseract internally for doing OCR when it encounters a document that doesn't have usable full-text. I agree

Re: [CODE4LIB] MARC field lengths

2013-10-17 Thread Karen Coyle
Thanks, Bill. What you say about assumptions is a good part of what is motivating me to try to instigate a discussion. As you know, both FRBR and RDA were developed by the cataloging community with no input from technologists. There are sweeping statements about FRBR being more efficient than

Re: [CODE4LIB] Google Analytics on multiple systems

2013-10-17 Thread Josh Wilson
Wow, 250,000? I'm not sure that's right, though I'm prepared to believe anything. I checked the GA documentation, which says you can officially have 50 profiles per account. Each property has at least one default profile, so that's probably the official limit of properties too, before you'd need

[CODE4LIB] Call for Proposals: MARC Formats Transition Interest Group at ALA Midwinter

2013-10-17 Thread Sarah Weeks
**Apologies for cross posting** -- The LITA/ALCTS MARC Formats Transition Interest Group invites proposals for presentations for its session at the 2014 ALA Midwinter Conference in Philadelphia, Pennsylvania. The meeting will take place on Saturday, January 25th, from 3pm

Re: [CODE4LIB] Google Analytics on multiple systems

2013-10-17 Thread Joel Marchesoni
Oh wow, sorry, that's not right. I was thinking 25; not sure where the 4 zeros came from... Joel -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Josh Wilson Sent: Thursday, October 17, 2013 11:18 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re:

[CODE4LIB] Job: Digital Initiatives Librarian at University of Wisconsin-Parkside

2013-10-17 Thread jobs
The University of Wisconsin-Parkside invites applications for the Digital Initiatives Librarian (Official Title: Associate Academic Librarian). This is a full-time, 12 month Academic Staff position. The Digital Initiatives Librarian manages the digital assets and digital collections of the

[CODE4LIB] Job: Professor of Audiovisual Archival Studies at University of California, Los Angeles

2013-10-17 Thread jobs
Assistant/Associate Professor of Audiovisual Archival Studies The Department of Information Studies of the Graduate School of Education and Information Studies at UCLA invites applications for a tenure-track assistant professor or tenured associate professor specializing in audiovisual archival

[CODE4LIB] Job: University Archivist & Special Collections Librarian at Adelphi University

2013-10-17 Thread jobs
University Archives and Special Collections (UASC) is comprised of two distinct collections--the official archives of the University, in multiple formats, and some 30 distinctive special collections in a variety of different subjects. Reporting to the Dean of Libraries, the University

[CODE4LIB] Job: Curator - Gordon W. Prange Collection and Librarian for East Asian Studies at University of Maryland, College Park

2013-10-17 Thread jobs
The University of Maryland Libraries are seeking dynamic and innovative applicants for the position of Curator of the Gordon W. Prange Collection and Librarian for East Asian Studies. The successful candidate will create and implement a vision for the Gordon W. Prange Collection, a

[CODE4LIB] Job: Systems Engineers at Virginia Polytechnic Institute and State University

2013-10-17 Thread jobs
Virginia Tech's Newman Library and the Center for Digital Research and Scholarship (CDRS) are seeking qualified candidates for two Systems Engineers for data initiatives. Incumbents will develop systems that: 1) enable data integration across distributed and heterogeneous local and external data

[CODE4LIB] Job: Library Web Manager at Brown University

2013-10-17 Thread jobs
Brown University Library seeks a Library Web Manager to oversee and manage content and software tools to support the Library's web presence. The Library Web Manager will coordinate with stakeholders across the Library to administer the Library's content management system and ensure consistency and

Re: [CODE4LIB] Online validator for RelaxNG or Schematron?

2013-10-17 Thread Barnes, Hugh
For RNG, as long as your schema is reachable and referenced correctly, it looks like this should work: http://validator.nu . Please let us know how you find it. Nothing known or found in a quick scan for Schematron. Somewhat surprised at the apparent lack of options. Cheers Hugh Barnes

[CODE4LIB] Code4Lib 2014 Call for Proposals

2013-10-17 Thread Ranti Junus
Code4lib 2014 is a loosely-structured conference that provides people working at the intersection of libraries/archives/museums and technology with a chance to share ideas, be inspired, and forge collaborations. The conference will be held at the *Sheraton Raleigh Hotel in downtown Raleigh, NC