[CODE4LIB] Job: Information Commons e-Learning Librarian at Butler University
The Information Commons Librarian provides library leadership for the planning, management, and oversight of the Butler University Information Commons (IC) program. The IC program is a student-staffed research and technology support service provided in partnership with the Center for Academic Technology. This position provides the vision for initiatives that support the development of student information literacy competencies through peer interaction and plays a key role in integrating library services into Butler University's developing e-learning curricula. Additionally, the position oversees hiring, training, and management of IC student employees employed by Butler Libraries and acts as library liaison to one or more academic departments. The position reports to Butler Libraries' Associate Dean of Public Services. Essential Duties and Responsibilities include: * working in partnership with the Center for Academic Technology to develop, direct, and assess shared Information Commons projects and initiatives that support the University curriculum and address evolving end-user needs; * collaborating with liaison librarians, faculty, and staff to create online learning tools and resources (e.g., online tutorials, web-casting instruction) that support the development of information literacy competencies; * hiring, training, and managing the Information Commons Assistants/Associates (approximately twenty students) who are employed by Butler Libraries; * serving as a liaison to an academic department, responsible for course-based information literacy instruction, collection development, and promoting faculty awareness of new delivery modes and issues in scholarly communication; * providing leadership in the adoption of emerging instructional technologies and trends in online learning relevant to Butler's academic mission. Desired Knowledge, Skills, and Abilities: * Understand the broad universe of information and information-seeking processes to structure library services for users
* Understand and apply principles of learning and instructional design into information literacy activities and library instruction * Use communication and interpersonal skills to interact effectively in a collaborative work environment * Understand and apply best practices in effective supervision * Use marketing and outreach skills to promote library resources and services as appropriate * Apply project management skills to plan, implement, and assess initiatives that align with the library's mission * Integrate use of relevant current technologies and tools into everyday practice and demonstrate their value to others * Work collaboratively and effectively with diverse groups, including students, faculty, and staff Minimum Qualifications: * Master's of Library Science from an ALA-accredited institution and ability to meet minimum qualifications for the rank of Assistant Professor as stated in 20.30.30.B.2.a of the Butler University Faculty Handbook. Preferred Qualifications: * Master's of Library Science from an ALA-accredited institution, second graduate degree in instructional design or related area, and ability to meet minimum qualifications for the rank of Assistant Professor as stated in 20.30.30.B.2.a of the Butler University Faculty Handbook. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6748/
[CODE4LIB] Job: Metadata and Discovery Services Librarian at University of New Mexico
The University of New Mexico Libraries (UL) has an opening for a Metadata and Discovery Services Librarian. Reporting to the Director of Discovery, Acquisitions, and Consortial Services, this position is a full-time, 12 month, probationary appointment leading to a tenure decision. The faculty rank and tenure status are negotiable based on qualifications. The anticipated start date is June 1, 2013. The minimum annual salary is $50,000 and is negotiable based on qualifications. This position includes full benefits. Working in a team-oriented and highly electronic environment, the Metadata and Discovery Services Librarian will play an important role in an organization that is committed to making content in all formats more accessible and discoverable for educational and research purposes. The Metadata and Discovery Services Librarian will take full advantage of and contribute to the evolution of multiple metadata languages and new discovery tools and platforms, especially as the UL is in the process of selecting a new ILS and compatible discovery tools. The Metadata and Discovery Services Librarian will keep current with developments in widely used metadata languages such as but not limited to: RDA, Dublin Core, VRA Core, and EAD. This librarian will be responsible for leading projects to improve the UL's metadata, and for ongoing training of staff to keep skills in the UL current with needs. The librarian will work within the UL's integrated library system, DSpace, and CONTENTdm. This position will work closely with Cataloging and Discovery Services, the LIBROS Library Consortium of New Mexico academic libraries, and all other departments of the UL. The Metadata and Discovery Services Librarian will play an active role in the UL's Web Committee and the SearchUNM team, providing expertise on good web and search design. 
The UL integrates into all we do the UNM values of Excellence, Access with Support to Succeed, Integrity, Diversity, Respectful Relationships, Freedom, and Sustainability. The UL adds to UNM's values: Service, Trust, Collaboration, and Accountability. Primary Duties: The Metadata and Discovery Services Librarian will be responsible for: quality control of metadata created and used by the UL in all its platforms, whether individually created or batchloaded; making recommendations to the Library for metadata workflow; implementing new discovery tools and maintaining those tools; Search Engine Optimization for UL web pages; being an active member of the LIBROS Coordination Team and providing expertise for that team in public interface design and systems; collaborating with Data Librarians on metadata components of data management plans; ongoing training of Cataloging and Discovery Services Staff in metadata practice; teaching a for-credit Metadata Course for the UL. The Metadata and Discovery Services Librarian will raise awareness among library staff and the entire campus community about emerging trends in content discovery methods. The Metadata and Discovery Services Librarian will contribute to local, regional, and national metadata initiatives. The Metadata and Discovery Services Librarian will participate in faculty governance meetings and in library management meetings as required. The Metadata and Discovery Services Librarian will contribute to Library initiatives that further UNM's commitment to diversity and inclusion. UNM's confidentiality policy (Disclosure of Information about Candidates for Employment, UNM Board of Regents' Policy Manual 6.7), which includes information about public disclosure of documents submitted by applicants, is located at http://www.unm.edu/~brpm/r67.htm The University of New Mexico is an Equal Employment Opportunity/Affirmative Action Employer and Educator. 
Minimum Qualifications: * Earned Master's degree from an ALA-accredited Library/Information Science program or an international equivalent; * Three years of experience (36 months) managing or coordinating metadata workflow in an academic setting within the last five years. Preferred Qualifications: * Experience with XML, XSLT, and Dublin Core; * Experience with additional metadata schema; * Demonstrated knowledge of AACR2, MARC, and RDA; * Experience with web development; * Demonstrated understanding of developments in linked data; * Experience working on teams across the library and outside the library; * Demonstrated knowledge of discovery systems in academic libraries; * Demonstrated knowledge of Library Integrated Systems in academic libraries; * Experience planning and facilitating training for library staff; * Experience implementing search tools such as the Google Search Appliance or similar; * Experience teaching graduate level metadata courses; * Excellent oral, written and interpersonal communication skills; and * Demonstrated ability to work effectively with culturally diverse populations. Brought to you by code4lib jobs:
[CODE4LIB] Job: Scholarly Communication Librarian at Butler University
Butler University Libraries invite applications for a Scholarly Communication Librarian, a 12-month, non-tenured (continuing appointment) position with the rank of assistant professor or associate professor, DOQ. This position provides leadership for scholarly communication and digitization initiatives at Butler University Libraries. Responsibilities: Scholarly communication is a strategic priority for Butler Libraries, and this position is responsible for managing and developing the library's institutional repository, digital publishing initiatives, and digitization projects. The librarian in this position leads outreach initiatives to faculty and others on issues relevant to scholarly publishing, including author rights, open access (OA), and alternative publishing trends related to tenure and promotion. The position also serves as the library's primary resource on intellectual property issues that pertain to library collections and services. As a library faculty member, the Scholarly Communication Librarian provides library instruction, collection development, and research support for a selected college or department(s) and engages in scholarly and service activities. This position reports to the Associate Dean for Technical Services. To read the full position description, click here. Description of institution/organization: Butler University's mission is to provide high quality, integrated liberal arts and professional education programs built upon interactive dialogue and critical inquiry. Library services are central to this mission. In Fall 2012 Butler Libraries engaged in a strategic planning process that resulted in a new vision: Where Knowledge Inspires Transformation. This position supports the new vision, priorities, and strategic goals to realign library services to meet the current and future information needs of students and faculty. 
Minimum and Preferred Qualifications: Successful candidates must have a Master's of Library Science from an ALA-accredited institution and the ability to meet minimum qualifications for the rank of Assistant Professor in librarianship, scholarship, and service as stated in 20.30.30.B.2.a of the Butler University Faculty Handbook. Candidates with a Juris Doctor degree and the ability to meet minimum qualifications for the rank of Associate Professor as stated in 20.30.30.B.2.b of the Butler University Faculty Handbook will receive preference. Applicants for the position should submit a letter of interest, curriculum vitae, statement of teaching philosophy, and the names and contact information for three professional references to: Josh Petrusa, Associate Dean for Technical Services at: jpetr...@butler.edu. Applicants planning to attend the ACRL conference in Indianapolis should indicate travel dates in their letter of interest. Screening of applications will begin March 25, 2013, with a start date of August 1, 2013. Butler University is committed to enhancing the diversity of the student body and our faculty and staff. It is the policy of the University to provide equal opportunities for employment and advancement for all individuals, regardless of age, gender, race, religion, color, disability, veteran status, sexual orientation, national origin, or any other legally protected category. Email jpetr...@butler.edu to apply for this job. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6752/
[CODE4LIB] Job: Electronic Records Analyst at Ohio
The Attorney General's Office is currently seeking an Electronic Records Analyst in the Records Management Section. The duties for this position include but are not limited to: iManage/IRM System Administration: creates new matters in iManage and conducts conflict research; runs and manages retention/disposition reports and authorizations; assists sections with indexing, barcoding, labeling and managing of hardcopy records in relation to the system. Review and Creation of Records Retention Schedules: conducts records analysis meetings with sections in order to gather enough information to draft a retention schedule; conducts legal/compliance research on laws, regulations and standards affecting the retention of records; monitors state RIMS system for status of retention schedules and provides proper notice and filing when they have been approved for use. Review and Approve Records Disposal Request Forms: reviews and approves records disposal request forms submitted by sections to verify that the records are eligible for disposal and have been documented properly. Other duties as assigned: develops and conducts staff training; other duties as requested by Senior Records Manager. Minimum Qualifications: Master's degree in library information science or public history, CRM, or 3 years of professional records management experience. Preferred Qualifications: Experience with document/records management systems. To read the full announcement and apply, go to [Ohio's job site](http://jobsearch.ohiomeansjobs.monster.com/) and enter Electronic Records Analyst in the job title field. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6754/
[CODE4LIB] Job: Project Archivist at Whitney Museum of American Art
The Frances Mulhall Achilles Library, Whitney Museum of American Art, seeks a candidate to fill a part-time, grant-funded position for a Project Archives Assistant who will follow the Whitney's established policies and procedures for processing archives. Responsibilities include, but are not limited to, the following: * Process institutional archives (materials include paper documents, photographs, and varied ephemera) and apply appropriate professional standards following DACS rules; * Have a high level of understanding and experience using Archivists' Toolkit™ (AT) for data entry; * Train and supervise interns to work with AT; * Create finding aids; * Respond to archives reference questions both in-person and remotely from both Museum staff and outside researchers; * Be able to lift 40 lb. boxes. Requirements: MLS or MLIS degree; archives certificate preferred; experience using Archivists' Toolkit™ is a must, Encoded Archival Description, XML and familiar with MARC/AMC, DACS and VRA standards; strong written and verbal communication, people, and organizational skills. If interested, please email your resume, cover letter, and salary requirements to: libr...@whitney.org Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6757/
[CODE4LIB] Job: Access Services/Cultural Heritage Manager at Alabama State University
The selected applicant will oversee access services for the division of archives and cultural heritage to include the management of access services for print, multimedia, digital archives and public history centers within the division. This position will assist the archivist in the area of university archives, special collections and cultural heritage programs, and will serve as assistant to the Director of the National Center for the Study of Civil Rights and African-American Culture and other museum projects. The Access Services/Cultural Heritage Manager will report to the Archivist and provide oversight/management of access to archival and museum collections; provide interlibrary/document delivery, electronic services and other general access services for all archival and cultural heritage departments; assist in marketing archival services to the academic community; host public library and cultural programs in conjunction with the archivist to achieve approved goals and objectives; provide curatorial services within the department and to other departments/museum programs; train and supervise staff in planning, organizing, coordinating and measuring of work activities; participate in team-based instructional projects and serve on various library teams; manage the digitization activities of the division by providing online access to archival and museum information/collection; evaluate the collection of electronic products for strengths and weaknesses; develop and coordinate access policies and procedures; assist with enhancements of the library's web page; work departmental desk on an as-needed basis, to include evenings and weekends in rotation; conduct and manage the day-to-day operations of the National Center and its staff with direct reporting to the center's director; and perform other duties as assigned. 
Minimum Qualifications: A Master's degree from an ALA accredited library program or associated field to include archives, public history, museum management and 3 years of professional library or cultural heritage organization experience and some management/supervisory experience are required. Doctorate degree is preferred. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6758/
[CODE4LIB] Job: Project Archivist at Brooklyn Historical Society
The Brooklyn Historical Society Othmer Library in Brooklyn, NY is seeking an energetic, team-oriented candidate for a full-time, 18 month appointment, grant-funded Project Archivist position. The successful candidate will report to the Director of Library and Archives. The Organization: BHS' Othmer Library houses the most comprehensive collection of Brooklyn-related materials in the world. In 1993, the U.S. Department of Education designated the Othmer Library as a major research library under Title II-C of the Higher Education Act. Today the collection includes more than 100,000 books and pamphlets, 60,000 photographs and prints, 2,000 feet of archival collections, and more than 2,000 maps and atlases. These materials include family histories, rare books, periodicals, serials, journals, personal papers, institutional records, and oral histories that document Brooklyn's many different ethnic groups and neighborhoods. We draw from these holdings to create interpretive exhibitions that prompt students, scholars and members of the general public to reconsider the fundamental facts of history in light of primary source documents and artifacts. BHS serves more than 45,000 people annually by providing opportunities for civic dialogue and community engagement for children and adults through exhibit tours, public programming, research opportunities, educational programs for New York City students, and professional development workshops and written curricula for teachers. Job Responsibilities: The successful candidate will be responsible for processing, arranging, and describing the Brooklyn Corporate Counsel records, a collection of unprocessed legal documents that encompass the period from ca. 1820 to ca. 1920, when Brooklyn formed as an independent city and then consolidated with the other boroughs to form New York City. 
Using Archivists' Toolkit to create an EAD finding aid according to the standards set forth in BHS's archival processing manual and Describing Archives: a Content Standard (DACS), the Project Archivist will also be responsible for exporting that descriptive record from the Toolkit and importing it into a variety of other systems for public access; updating and maintaining procedures and policies; and providing information for reports to the granting agency. In addition to survey project responsibilities, the Project Archivist may cover the reference desk during the library's open hours up to 2 times per month, and other responsibilities as assigned. Required Qualifications: Masters in Library and Information Science or History, or equivalent degree, with a specialization in archival studies and completion of a library cataloging course; Demonstrated understanding of archival collections and principles of arrangement and description through a completed finding aid or other description tool; Effective oral and written communication skills; Ability to work both independently and as part of a team; Strong organization and time-management skills; attention to accuracy and detail is essential; Familiarity with MARC and EAD; AACR2 and DACS; and with the use and application of standardized vocabularies; Supervisory experience, either within an archive or another work setting; Ability to lift, bend, and reach boxes or volumes weighing up to 40 lbs repeatedly, including handling these materials while standing on rolling ladders and stepstools; Ability to work in library stacks in cold temperatures (60-65 degrees Fahrenheit) for up to an eight-hour workday, five days a week for 18 months; and Demonstrated reliable attendance to ensure successful and timely project completion. 
Preferred: 2-3 years post-MLS processing experience; Previous experience working with CMS and ILS systems; familiarity with WordPress content management systems; experience specifically with Archivists' Toolkit and/or Ex Libris Primo and Aleph is highly desirable; Previous archival processing and description experience, including an understanding of pragmatic and efficient processing procedures; Undergraduate degree in history. A working knowledge of U.S. history is needed, to determine how collections fit into state and national issues for purposes of cataloging; knowledge of legal processes and terminology; and knowledge of Brooklyn or New York history is preferred; Experience handling and providing basic preservation treatments for historic materials. Compensation: Salary starts at $40,000 a year, dependent on experience and qualifications. Benefits include full medical and dental benefits; sick and vacation days; and optional pre-tax public transportation payroll deduction. This is a temporary, grant-funded position which will not extend past the grant period, ending December 31, 2014. How To Apply: Applicants should apply with a cover letter that includes a complete statement of qualifications; a full resume of their education and relevant experience; and the names, addresses, and phone
[CODE4LIB] Job: Electronic Resources Librarian at University of Maryland University College
University of Maryland University College (UMUC) seeks an Electronic Resources Librarian in the Information and Library Services (ILS) Department. Reporting to the Assistant Director for Electronic Resources for ILS, the Electronic Resources Librarian will assist with the selection and acquisition process, vendor contacts and negotiations, budget planning and monitoring, financial management, statistical reporting, and maintenance and access projects. Works with electronic resources manager on Procurement Department and Office of Legal Affairs matters and projects as well as assisting with all ILS license agreements. Works closely with other ILS staff to implement electronic resource systems and manage library information systems, and represents UMUC on external committees related to electronic resources. SPECIFIC RESPONSIBILITIES INCLUDE: * Reports to the electronic resources manager for ILS to assist with the selection and acquisition process, vendor contacts and negotiations, budget planning and monitoring, financial management, statistical reporting, and maintenance and access projects. * Works with electronic resources manager on Procurement Department and Office of Legal Affairs matters and projects as well as assisting with all ILS license agreements. * Works closely with other ILS staff to implement electronic resource systems and manage library information systems, and represents UMUC on external committees related to electronic resources. * Represent UMUC on external committees related to electronic resources. * Perform other job-related duties as assigned. REQUIRED EDUCATION AND EXPERIENCE: * ALA-accredited Masters of Library Science or equivalent; minimum of 5 years of experience in an academic library or similar setting. 
PREFERRED EDUCATION AND EXPERIENCE: Experience with a discovery system, preferably EBSCO Discovery Service; experience with OpenURL systems, SFX or A-Z/LinkSource; experience with an electronic resources management system, preferably EBSCO ERM Essentials; familiarity with emerging information technologies and web technologies; ability to participate in and lead digital initiatives in collaboration with librarians, faculty, and university administration; experience using spreadsheets; experience with Microsoft products; experience communicating with vendors; experience with library consortia; ability to collaborate and work with library faculty and staff, faculty in academic departments, and other staff at a university; ability to work well individually and within a team in a non-traditional academic and evolving work environment; experience in the distance education environment; experience in project management; strong analytical, problem solving, and organizational skills, as well as attention to detail; excellent interpersonal, oral, and written communication skills; strong writing ability; conference presentation experience; knowledge about the process of selecting, acquiring, evaluating, and licensing electronic resources; knowledge about establishing and maintaining budget guidelines for electronic resources; capable of handling financial and accounting matters related to electronic resources management; knowledge of electronic resource statistical reports and providing assessment reports; able to manage access issues and maintenance of electronic resources; academic degree in computer science; academic degree in business, finance, or accounting. POSITION AVAILABLE IMMEDIATELY; WILL REMAIN OPEN UNTIL FILLED. SALARY COMMENSURATE WITH EXPERIENCE. All submissions should include a cover letter and resume. 
UMUC offers an excellent benefits package to include up to 8 credits of tuition remission per semester, a minimum of 22 days of leave, and a range of insurance options. For detailed benefits information, please visit http://www.umuc.edu/visitors/careers/benefits.cfm UMUC - an Equal Opportunity Employer. The University distributes an annual information report which includes campus security information that is available to prospective employees. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6766/
[CODE4LIB] Job: Integrated Technologies Librarian at Lafayette College
Lafayette College seeks a service-oriented and creative Integrated Technologies Librarian to join its new Digital Scholarship Services program. The successful candidate will share responsibility for the library's ILS (Innovative Interfaces' Sierra), will lead UI/UX design and the use of web analytics tools for digital library projects, and will investigate and implement technologies to improve discovery, access, and delivery of digital resources. Qualifications: ALA-accredited MLS or the equivalent; knowledge of current and emerging technologies in academic librarianship; ability to develop creative and innovative approaches to improving the user experience; expertise in XHTML, CSS, JavaScript/jQuery; understanding of both public and technical service environments; ability to work collegially and communicate effectively with a wide range of audiences; ability to understand and convey meaningful information about technical problems to vendors and the college's central IT unit. Candidates with experience administering Drupal and/or institutional repository software, a history of user interface development, additional programming knowledge, or with keen interest in and strong potential for innovative digital library development work will receive special consideration. Compensation: salary commensurate with qualifications and experience; excellent benefits, including college tuition support for children. The library strongly encourages and supports professional development. For consideration, please submit a resume, cover letter addressing job qualifications, and three professional references to: Neil McElroy, Dean of Libraries, Lafayette College, Easton, PA 18042 or via email to: caste...@lafayette.edu. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6768/
[CODE4LIB] Job: Library Technologies Support Analyst at Ball State University
Responsibilities: Provide ongoing support, administration, analysis and development of computer/data systems and processes critical to the services and operations of University Libraries; integrate new library information technology solutions into existing library systems. Minimum qualifications: Bachelor's or master's degree in computer science, information systems/technology, MIS, or related field at time of appointment; one year of experience supporting, configuring, administering, troubleshooting, and networking; training experience; in-depth knowledge of various computer platforms, Windows, Mac, UNIX; working knowledge of Microsoft Office applications; effective oral and written communication skills; ability to work some evenings and/or weekends. Preferred qualifications: Master's degree in computer science, applied technology, information systems/technology, MIS, or related field with strong emphasis in information system management; one year of experience in information systems support, project management, and administration of an integrated library system, such as SirsiDynix Symphony, interlibrary loan management systems, OpenURL link resolvers, and federated search engines; demonstrated experience using Perl, PHP, ASP.NET, JavaScript, Java and/or other web interface technologies. Salary up to $44,950 plus excellent benefits. Send cover letter, resume, transcript of highest degree earned (unofficial copies acceptable), and the names and contact information for three references (at least one of which is a current or former supervisor) to: Dr. Arthur W. Hafner Dean of University Libraries Ball State University Muncie, IN 47306. Review of applications will begin immediately and will continue until the position is filled. www.bsu.edu/library Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6770/
[CODE4LIB] Job: Summers Internship at Argonne National Laboratory
Argonne National Laboratory has an immediate opportunity for a graduate student enrolled in an MLS program to fill a 2013 summer student position with the Research Library. The primary focus of this position would be to review and update metadata of Argonne-authored publications, assist in documenting and organizing publishing procedures and collaborate with other staff members on future plans for the sharing and re-use of publication data. Ideal applicants would have completed coursework relevant to the role libraries play in the scholarly communication process such as: metadata standards, scholarly publishing, e-Science, intellectual property rights, digital curation and digital libraries. An undergraduate degree in a scientific discipline is preferred but is not required. Applicants must be U.S. citizens. Interested candidates should email lisar...@anl.gov Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6773/
[CODE4LIB] Omeka and BookReader?
I am hoping someone with experience with either Omeka or the Internet Archive BookReader can offer some advice. There is a plugin for Omeka that uses the BookReader: https://github.com/jsicot/BookReader. I installed the plugin in one of my Omeka projects, and it works fine, except the jpeg files are also displaying on the item page, as in this example – http://www.stonecampbellmovement.com/items/show/146. I think I will have to modify the php files in the Omeka theme as far as how it displays item pages, and not modify the BookReader plugin. Anyway, I just wanted to get some pointers from any of you with experience with Omeka or the BookReader, since I haven’t modified php files before, and this will be my first attempt. Thanks! Lisa Gonzalez Electronic Resources Librarian Catholic Theological Union 5401 S. Cornell Ave. Chicago, IL 60615 773-371-5463 lgonza...@ctu.edu
[CODE4LIB] web-based ocr
Does anybody here know of a Web-based OCR program or Web service? Many people want to do OCR against digitized texts. We all know of various OCR applications (Adobe Acrobat, ABBYY FineReader, Google's Tesseract, etc.), but they are not necessarily Web-based. As a service to my university, I thought it might be cool (or kewl) to support an image to text application. Go to Web form. Submit one or more image files. Have OCR done against them no matter how dirty the output. Return plain text. As a bonus, the application would support a REST-ful API. Does anybody know of something like this that exists already? -- Eric Lease Morgan University of Notre Dame
Re: [CODE4LIB] web-based ocr
Hi, I recently looked into similar services... There are some cloud-based vendors that do this. Abbyy, for example, offers one. But the cost seems rather high when working in bulk. I did the math and it didn't make sense for us. I think they market it towards people building mobile apps, not scanning books. Luckily, the Internet Archive OCRs documents uploaded to it for free. And the OCR results are pretty good (or better than I ever got with Tesseract). So I use that a lot. However, you have to upload your document in a specific zipped-up package... I don't think there's a generic web form. For something like that, I'd suggest ... Google Drive. It OCRs documents fairly well, although they have a size limit. We're using Google Apps for Education as our Digital Repository, so that works pretty well for a lot of our smaller documents... b,chris. Eric Lease Morgan wrote: Does anybody here know of a Web-based OCR program or Web service? Many people want to do OCR against digitized texts. We all know of various OCR applications (Adobe Acrobat, ABBYY FineReader, Google's Tesseract, etc.), but they are not necessarily Web-based. As a service to my university, I thought it might be cool (or kewl) to support an image to text application. Go to Web form. Submit one or more image files. Have OCR done against them no matter how dirty the output. Return plain text. As a bonus, the application would support a REST-ful API. Does anybody know of something like this that exists already? -- Eric Lease Morgan University of Notre Dame
Re: [CODE4LIB] web-based ocr
Am 12.03.2013 16:57, schrieb Eric Lease Morgan: Does anybody know of something like this that exists already? We are running something like this. Not with a HTML or REST-ful front end, but WebDAV. The users of this service do mass digitization. They mount their individual WebDAV share, push scanned image files there and read the OCR results from output files (usually not by hand but with some software that manages their digitization workflow). The actual OCR is done by an ABBYY Recognition Server, the WebDAV front end including accounting is a straightforward home-brewed solution. Till -- Till Kinstler Verbundzentrale des Gemeinsamen Bibliotheksverbundes (VZG) Platz der Göttinger Sieben 1, D 37073 Göttingen kinst...@gbv.de, +49 (0) 551 39-13431, http://www.gbv.de
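The hot-folder pattern Till describes (push images to a share, read OCR results from output files) can be sketched in Python roughly as follows. This is only an illustration under assumptions, not VZG's implementation: the function names, the list of image suffixes, and the convention of writing a .txt file next to each image are all hypothetical, and the ocr argument is a stand-in for whatever engine does the real work (an ABBYY Recognition Server in their case).

```python
from pathlib import Path

# Hypothetical list of suffixes treated as scanned images.
IMAGE_SUFFIXES = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def pending_images(share_dir):
    """Return images in the share that have no sibling .txt result yet."""
    share = Path(share_dir)
    return sorted(
        path
        for path in share.iterdir()
        if path.is_file()
        and path.suffix.lower() in IMAGE_SUFFIXES
        and not path.with_suffix(".txt").exists()
    )


def process_share(share_dir, ocr):
    """Run ocr(path) on every pending image and write the text beside it.

    ocr is any callable that takes a Path and returns a string; it is a
    placeholder for the real OCR engine, which is not modelled here.
    """
    written = []
    for image in pending_images(share_dir):
        output = image.with_suffix(".txt")
        output.write_text(ocr(image), encoding="utf-8")
        written.append(output)
    return written
```

A scheduler (cron, for instance) could call process_share on each user's mounted directory; because finished images already have a .txt sibling, repeated runs skip them.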
Re: [CODE4LIB] Omeka and BookReader?
I would be interested in any information on this plugin as well. I'm having the same display problem as Lisa and also would like to get this plugin to work with PDFs. So far I've only had luck with jpegs. Any assistance is appreciated. Thanks! Shannon Showers Digital Projects Librarian Washington University -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Lisa Gonzalez Sent: Tuesday, March 12, 2013 10:56 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Omeka and BookReader? I am hoping someone with experience with either Omeka or the Internet Archive BookReader can offer some advice. There is a plugin for Omeka that uses the BookReader: https://github.com/jsicot/BookReader. I installed the plugin in one of my Omeka projects, and it works fine, except the jpeg files are also displaying on the item page, as in this example - http://www.stonecampbellmovement.com/items/show/146. I think I will have to modify the php files in the Omeka theme as far as how it displays item pages, and not modify the BookReader plugin. Anyway, I just wanted to get some pointers from any of you with experience with Omeka or the BookReader, since I haven't modified php files before, and this will be my first attempt. Thanks! Lisa Gonzalez Electronic Resources Librarian Catholic Theological Union 5401 S. Cornell Ave. Chicago, IL 60615 773-371-5463 lgonza...@ctu.edu
[CODE4LIB] Handwriting and ocr
On a related note, I am looking for a recommendation for software that provides OCR for handwriting (print and/or cursive). To clarify, this would be pen ink on paper not digital ink. Thank you, Donna R. Campbell Technical Services Systems Librarian (215) 935-3872 (phone) (267) 295-3641 (fax) Mailing Address (via USPS): Westminster Theological Seminary Library P.O. Box 27009 Philadelphia, PA 19118 USA Shipping Address (via UPS or FedEx): Westminster Theological Seminary Library 2960 W. Church Rd. Glenside, PA 19038 USA -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Eric Lease Morgan Sent: Tuesday, March 12, 2013 11:57 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] web-based ocr Does anybody here know of a Web-based OCR program or Web service? Many people want to do OCR against digitized texts. We all know of various OCR applications (Adobe Acrobat, ABBYY FineReader, Google's Tesseract, etc.), but they are not necessarily Web-based. As a service to my university, I thought it might be cool (or kewl) to support an image to text application. Go to Web form. Submit one or more image files. Have OCR done against them no matter how dirty the output. Return plain text. As a bonus, the application would support a REST-ful API. Does anybody know of something like this that exists already? -- Eric Lease Morgan University of Notre Dame
Re: [CODE4LIB] Handwriting and ocr
I don't think it would be possible to OCR handwriting. As far as I can remember, the results are pretty useless, unless you use something like reCAPTCHA. Kun -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Donna Campbell Sent: Tuesday, March 12, 2013 1:56 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Handwriting and ocr On a related note, I am looking for a recommendation for software that provides OCR for handwriting (print and/or cursive). To clarify, this would be pen ink on paper not digital ink. Thank you, Donna R. Campbell Technical Services Systems Librarian (215) 935-3872 (phone) (267) 295-3641 (fax) Mailing Address (via USPS): Westminster Theological Seminary Library P.O. Box 27009 Philadelphia, PA 19118 USA Shipping Address (via UPS or FedEx): Westminster Theological Seminary Library 2960 W. Church Rd. Glenside, PA 19038 USA -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Eric Lease Morgan Sent: Tuesday, March 12, 2013 11:57 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] web-based ocr Does anybody here know of a Web-based OCR program or Web service? Many people want to do OCR against digitized texts. We all know of various OCR applications (Adobe Acrobat, ABBYY FineReader, Google's Tesseract, etc.), but they are not necessarily Web-based. As a service to my university, I thought it might be cool (or kewl) to support an image to text application. Go to Web form. Submit one or more image files. Have OCR done against them no matter how dirty the output. Return plain text. As a bonus, the application would support a REST-ful API. Does anybody know of something like this that exists already? -- Eric Lease Morgan University of Notre Dame
Re: [CODE4LIB] web-based ocr
Something like this is on my to do list for our future Fedora Commons deployment here at UConn. I was considering wrapping a SOAP interface around something like the Perl Image::OCR::Tesseract module and adding it to our ingest pipeline unless someone can recommend a better OCR application. Rick -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Till Kinstler Sent: Tuesday, March 12, 2013 12:30 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] web-based ocr Am 12.03.2013 16:57, schrieb Eric Lease Morgan: Does anybody know of something like this that exists already? We are running something like this. Not with a HTML or REST-ful front end, but WebDAV. The users of this service do mass digitization. They mount their individual WebDAV share, push scanned image files there and read the OCR results from output files (usually not by hand but with some software that manages their digitization workflow). The actual OCR is done by an ABBYY Recognition Server, the WebDAV front end including accounting is a straightforward home-brewed solution. Till -- Till Kinstler Verbundzentrale des Gemeinsamen Bibliotheksverbundes (VZG) Platz der Göttinger Sieben 1, D 37073 Göttingen kinst...@gbv.de, +49 (0) 551 39-13431, http://www.gbv.de
Re: [CODE4LIB] web-based ocr
Thank you for the prompt replies. Call me cheap or unable to navigate the political/fiscal landscape, but I don't see myself subscribing to a service. Instead I see putting a wrapper around Tesseract, but alas, the wrappers are written in languages that I don't know. [1] Hmmm… On the Perl side, I am having problems installing Image::OCR::Tesseract. [1] Wrappers - http://code.google.com/p/tesseract-ocr/wiki/AddOns -- Eric Still Cogitating Morgan
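For anyone following the "wrapper around Tesseract" idea, here is a minimal Python sketch of what such a wrapper might look like, using only the standard library. It is an illustration under assumptions, not a tested service: the names build_tesseract_command, ocr_bytes, OcrHandler, and serve are hypothetical, the host and port are arbitrary, the POST body is assumed to be the raw image bytes (not a multipart form upload), and a tesseract binary that accepts "stdout" as its output base is assumed to be on the PATH.

```python
import subprocess
import tempfile
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path


def build_tesseract_command(image_path, lang="eng"):
    """Return the argv list for running the tesseract CLI on one image.

    Passing "stdout" as the output base asks tesseract to print the
    recognized text instead of writing a file (assumed to be supported
    by the installed version).
    """
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_bytes(image_bytes, suffix=".png", lang="eng", runner=subprocess.run):
    """Save uploaded bytes to a temporary file, run OCR, return plain text.

    runner is injectable so the function can be exercised without
    tesseract installed.
    """
    with tempfile.TemporaryDirectory() as workdir:
        image_path = Path(workdir) / ("upload" + suffix)
        image_path.write_bytes(image_bytes)
        result = runner(
            build_tesseract_command(image_path, lang),
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout


class OcrHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: POST raw image bytes, receive plain text."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        try:
            text = ocr_bytes(image_bytes)
        except (subprocess.CalledProcessError, FileNotFoundError):
            self.send_error(500, "OCR failed")
            return
        body = text.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="localhost", port=8000):
    """Start the sketch server; blocks until interrupted."""
    HTTPServer((host, port), OcrHandler).serve_forever()
```

Calling serve() and then posting an image with something like curl --data-binary @page.png http://localhost:8000/ would be one way to try it. A real service would also want upload size limits, authentication, and a queue for large batches.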
Re: [CODE4LIB] Handwriting and ocr
If it's for a discrete project, I'd say scan what you need OCR'd and put it on Mechanical Turk kyle On Tue, Mar 12, 2013 at 10:56 AM, Donna Campbell dcampb...@wts.edu wrote: On a related note, I am looking for a recommendation for software that provides OCR for handwriting (print and/or cursive). To clarify, this would be pen ink on paper not digital ink. Thank you, Donna R. Campbell Technical Services Systems Librarian (215) 935-3872 (phone) (267) 295-3641 (fax) Mailing Address (via USPS): Westminster Theological Seminary Library P.O. Box 27009 Philadelphia, PA 19118 USA Shipping Address (via UPS or FedEx): Westminster Theological Seminary Library 2960 W. Church Rd. Glenside, PA 19038 USA -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Eric Lease Morgan Sent: Tuesday, March 12, 2013 11:57 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] web-based ocr Does anybody here know of a Web-based OCR program or Web service? Many people want to do OCR against digitized texts. We all know of various OCR applications (Adobe Acrobat, ABBYY FineReader, Google's Tesseract, etc.), but they are not necessarily Web-based. As a service to my university, I thought it might be cool (or kewl) to support an image to text application. Go to Web form. Submit one or more image files. Have OCR done against them no matter how dirty the output. Return plain text. As a bonus, the application would support a REST-ful API. Does anybody know of something like this that exists already? -- Eric Lease Morgan University of Notre Dame
Re: [CODE4LIB] web-based ocr
Hi, In regards to handwriting, you could always train an OCR library to do this, and there are several OCR libraries that attempt to do this out-of-the-box (probably most notable is Evernote)... but yeah, the results vary greatly depending on the style of writing. Most focus on just hand-printed things like post-its. And a quick thing I found out recently about Tesseract. It is pretty good if all you want is the text extracted. It does not do layout recognition very well, so output will look funky if there are layout oddities... like footnotes. But it really depends on what you have and what you're trying to do. For example, I did not have much success making EPUBs with Tesseract, but it worked great with our theses (which have mandatory layout requirements). So another big bonus for using the Internet Archive (who, I think, use Abbyy). b,chris. Eric Lease Morgan wrote: Thank you for the prompt replies. Call me cheap or unable to navigate the political/fiscal landscape, but I don't see myself subscribing to a service. Instead I see putting a wrapper around Tesseract, but alas, the wrappers are written in languages that I don't know. [1] Hmmm… On the Perl side, I am having problems installing Image::OCR::Tesseract. [1] Wrappers - http://code.google.com/p/tesseract-ocr/wiki/AddOns -- Eric Still Cogitating Morgan
Re: [CODE4LIB] Handwriting and ocr
At the risk of shameless self-promotion, I would suggest an alternative to the attempt at using OCR for handwriting. My field of research focuses on pre-modern manuscripts which, to no one's surprise, have resisted any OCR method. One solution is to create an environment that makes transcribing an effective and efficient task. To that end, here at Saint Louis University, we built a web-based app called T-PEN. T-PEN attempts to identify the location of each line on a digital surrogate and then displays it with a text box underneath to ensure accurate transcription. The URL is t-pen.org. It's free for anyone. In addition to the repositories that have given us access, users can upload private images to work with. I know that this solution is not ideal for large sets of handwritten texts, but T-PEN does support crowd-sourcing (what we call public projects). You can also encode as you transcribe and then export the transcription as an XML document (and you can even export transcriptions in OAC, currently as RDF/XML). There is an introductory video at http://www.youtube.com/watch?feature=player_embedded&v=_81fJbOpTcE. Jim On Tue, Mar 12, 2013 at 2:00 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: If it's for a discrete project, I'd say scan what you need OCR'd and put it on Mechanical Turk kyle On Tue, Mar 12, 2013 at 10:56 AM, Donna Campbell dcampb...@wts.edu wrote: On a related note, I am looking for a recommendation for software that provides OCR for handwriting (print and/or cursive). To clarify, this would be pen ink on paper not digital ink. Thank you, Donna R. Campbell Technical Services Systems Librarian (215) 935-3872 (phone) (267) 295-3641 (fax) Mailing Address (via USPS): Westminster Theological Seminary Library P.O. Box 27009 Philadelphia, PA 19118 USA Shipping Address (via UPS or FedEx): Westminster Theological Seminary Library 2960 W. Church Rd.
Glenside, PA 19038 USA -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Eric Lease Morgan Sent: Tuesday, March 12, 2013 11:57 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] web-based ocr Does anybody here know of a Web-based OCR program or Web service? Many people want to do OCR against digitized texts. We all know of various OCR applications (Adobe Acrobat, ABBYY FineReader, Google's Tesseract, etc.), but they are not necessarily Web-based. As a service to my university, I thought it might be cool (or kewl) to support an image to text application. Go to Web form. Submit one or more image files. Have OCR done against them no matter how dirty the output. Return plain text. As a bonus, the application would support a REST-ful API. Does anybody know of something like this that exists already? -- Eric Lease Morgan University of Notre Dame -- -- James R. Ginther, PhD Professor of Medieval Theology, Associate Chair, Department of Theology Director, Center for Digital Theology Saint Louis University - gint...@slu.edu Faculty Page: Departmental Pagehttps://sites.google.com/a/slu.edu/james-ginther/ https://sites.google.com/a/slu.edu/james-ginther/Research Blog: http://digital-editor.blogspot.com Twitter: DH_editor http://twitter.com/#!/DH_editor T-PEN: www.tpen.org/ NOTE: This e-mail message may contain information that may be privileged, confidential, and exempt from disclosure. It is intended for use only by the person(s) to whom it is addressed. If you have received this message in error, please do not forward or use this information in any way; delete it immediately, and contact the sender as soon as possible by the reply option or by telephone at 314-977-4248.
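Jim's message above mentions that T-PEN tries to locate each line of writing on a page image. The thread does not say how T-PEN does this, so the following Python sketch is not T-PEN's method; it only illustrates one textbook approach (a horizontal projection profile) under the assumption that the page has already been reduced to a grid of 0/1 pixels where 1 means ink. The function name find_text_lines and the min_ink parameter are hypothetical.

```python
def find_text_lines(rows, min_ink=1):
    """Return (top, bottom) row ranges, inclusive, that contain ink.

    rows is a list of pixel rows, each a list of 0/1 values (1 = ink).
    A row counts as part of a text line when it holds at least min_ink
    ink pixels; consecutive such rows are merged into one line.
    """
    lines = []
    start = None
    for y, row in enumerate(rows):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y  # a new line of writing begins here
        elif not has_ink and start is not None:
            lines.append((start, y - 1))  # the line ended on the previous row
            start = None
    if start is not None:
        lines.append((start, len(rows) - 1))  # line runs to the page bottom
    return lines
```

Real manuscript pages are skewed, stained, and have ascenders and descenders that overlap between lines, so a practical tool would need considerably more than this; the sketch only shows the basic idea of turning a page into line regions that a transcriber can work through one at a time.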
[CODE4LIB] Google Drive as an IR
So, yeah, new thread. Sorry (I'm not sorry). tl;dr = it's not perfect but you'll never get access control/revision/fulltext searching functionality even if you spend ~1000x more. About using Google Drive... yeah, we're very small ( 115 students!), so we're very interested in keeping our overheads nice and low. I guess I'm old enough to think that 100 GB for $5 a month is a pretty good deal, so we started saying Google Drive is our IR as a joke, but like it's actually turned into a really nice IR type thingy. We just added a generic library user in our domain and bought extra drive space for it. We try to keep things orderly by putting them in various folders (Dissertations, Articles, UN Documents), since it makes it easier to recursively apply ACLs. GDrive is like AWS in that the folders are not really folders like we're used to on a file system, but more like tags... so if you move a file around, it keeps its UUID (and therefore URL), which is pretty nice. The best part is that since Google Apps uses OAuth, access control is really simple both in Google Apps and with external web apps. We can make a document open to the world, grant access to groups/individuals, only allow access if they have the URL, etc. This works if they search in Google Drive or if they're trying to access a document embedded on another site. The bad news is that there's not much (i.e., no) support in the way of descriptive metadata, which is kind of huge. To work around this, we currently keep descriptive metadata records either in our ILS (Koha) or in our group Mendeley account. This adds a bit of complexity to managing the metadata and also means there's not a discovery interface that allows for both full-text searching (which Google provides) and metadata searching (which Koha mostly provides).
I wrote an app last summer that indexes some of this content into a discovery interface, which I'm actually in the process of merging back into our Blacklight OPAC so we'll have a unified DS9DE (Deep Space 9 Discovery Environment). So, there's that... And slightly less bad news: OCR and the document viewer only support files under 20MB. We have a lot of very large PDFs, so it's a bit of a drag, but the students just have to download the PDF, so it's not so bad. And the Google Drive desktop client can be buggy and crashes if you try to sync a large collection. And you still have to figure out preservation (or not). But yeah, despite all that BS it's been pretty great. And since Google gives the CIA unlimited warrantless access, I assume that someone out there (i.e. DC metro area) is reading our content. Any questions, please feel free to ask me... b,chris.
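The gap Chris describes (full text in one place, descriptive metadata in another) might be bridged by keying both on the file's stable ID, since he notes that a file keeps its ID and URL when it moves. The Python sketch below is purely illustrative and is not his app: build_index, search, and the file_id and title field names are hypothetical, and the metadata records and extracted text are assumed to have been exported already from wherever they live (no Google Drive, Koha, or Mendeley API calls are shown).

```python
def build_index(metadata_records, fulltext_by_id):
    """Join metadata records and extracted text on a shared file ID.

    metadata_records: list of dicts, each with a "file_id" key
                      (hypothetical export from the catalogue).
    fulltext_by_id:   dict mapping file ID to extracted text.
    """
    index = {}
    for record in metadata_records:
        file_id = record["file_id"]
        index[file_id] = {
            "metadata": record,
            "text": fulltext_by_id.get(file_id, ""),
        }
    return index


def search(index, term):
    """Return sorted file IDs whose metadata or full text contains term."""
    needle = term.lower()
    hits = []
    for file_id, entry in index.items():
        in_metadata = any(
            needle in str(value).lower()
            for key, value in entry["metadata"].items()
            if key != "file_id"  # do not match on the ID itself
        )
        in_text = needle in entry["text"].lower()
        if in_metadata or in_text:
            hits.append(file_id)
    return sorted(hits)
```

A production discovery layer would hand this job to a real search engine (Blacklight sits on Solr, for example); the point here is only that a stable ID lets the two sources be merged into a single searchable record.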
Re: [CODE4LIB] Google Drive as an IR
On Mar 12, 2013, at 3:26 PM, chris fitzpatrick chrisfitz...@gmail.com wrote: About using Google Driveyeah, we're very small ( 115 students!), so we're very interested in keeping our over-heads nice and low.. I'm guess I'm old enough to think that 100 GB for $5 a month is a pretty good deal, so we started saying Google Drive is our IR as a joke, but like it's actually turned into a really nice IR type thingy…. 'Sounds like an article for Code4Lib Journal. Hint, hint. --Eric
Re: [CODE4LIB] Job: Digital Technologist at Gates Archive
Hi folks, This posting is still open. We are a new organization and don’t have an active website right now, so if you were interested in it but wondering who we were, here’s more background information about the Gates Archive: Gates Archive was formed in 2011 – we are capturing the personal and philanthropic archival collections of the Gates Family (such as the personal archives and the records of the Bill and Melinda Gates Foundation). A personal note about the archive – I moved from Chapel Hill (UNC) back to the Pacific Northwest after several years at academic libraries. I have been here for a little over a year and am amazed at how a program management focus and a great organizational culture has created a unique and special place to work. We are a new organization, but with this position posting, we are moving into the next phase. I am very excited about this job and think it will be a great opportunity for the right person. If you have any questions about the posting, don’t hesitate to get in touch. I’d be more than happy to talk or email in greater length. Best, Erin O'Meara On Mon, Feb 11, 2013 at 2:26 PM, j...@code4lib.org wrote: If you would like to apply, please email cover letter and resume to: care...@gatesarchive.com The Gates Archive is searching for an enthusiastic, collaborative, and creative digital technologist to work closely with the archive team in building out an innovative, new private family archive. This position works with technological systems to support the management, preservation and access of digital archival materials. This position requires relocation to the Pacific Northwest, and entails a rigorous background and security check. 
**Responsibilities:**
The successful candidate will bring expert-level knowledge to lead the development of the technical architecture for the archive, including:
* Digital preservation strategy and policy development
* Implementation and execution of digital asset management and digital preservation activities
* System architecture development for the management of digital assets
* Workflow review and refinement for the management of born-digital materials
* Statistics compiling and reporting to improve digital asset management
* Other organizational duties as required

**Qualifications**: To perform this job successfully, an individual must be able to perform each essential duty with a high degree of accuracy. The requirements listed below are representative of the knowledge, skill, and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform essential functions.

**Required Skills**
* Demonstrated expertise in developing and implementing digital repository systems
* Expert knowledge of server and storage architectures, IT middleware and relational databases
* Demonstrated experience planning and managing technology projects
* Demonstrated ability to work collaboratively and productively in a rapidly changing environment
* Proven ability to prioritize work and meet multiple deadlines
* Strong organizational and interpersonal communication skills
* Demonstrated ability to communicate effectively, both verbally and in writing
* Demonstrated knowledge of and ability to identify emerging trends in digital preservation

**Preferred Skills**
* Ability to translate complex business needs into functional requirements and system specifications
* Familiarity with a variety of metadata standards (e.g. METS, MODS, and PREMIS)
* Experience building hardware for the acquisition of digital media (e.g. configuring floppy drive controllers)
* Experience with data transformations (e.g. XSLT, perl and regular expressions)
* Experience creating technical and end-user documentation

**Computer skills**
* Programming and database administration experience
* MS Office
* MS SharePoint (SP 2010 preferred)
* Experience using database software and Internet search engines

**Language Ability:**
* Ability to read, analyze, and interpret general business periodicals, professional journals, technical procedures, or governmental regulations. Ability to write reports, business correspondence, and procedure manuals. Ability to speak effectively before groups of customers or employees of organization

**Reasoning Ability**:
* Ability to solve practical problems and deal with a variety of concrete variables in situations where only limited standardization exists. Ability to interpret a variety of instructions furnished in written, verbal, diagram, or schedule form

**Education/experience/certifications**
* A Bachelor's of Science degree in Computer Science, Information Science or equivalent combination of education and experience
* Minimum of five years relevant professional experience
* Relevant work experience in an archives or library

**Working Conditions**:
Re: [CODE4LIB] Google Drive as an IR
On Mar 12, 2013, at 3:29 PM, Eric Lease Morgan emor...@nd.edu wrote: On Mar 12, 2013, at 3:26 PM, chris fitzpatrick chrisfitz...@gmail.com wrote: About using Google Driveyeah, we're very small ( 115 students!), so we're very interested in keeping our over-heads nice and low.. I'm guess I'm old enough to think that 100 GB for $5 a month is a pretty good deal, so we started saying Google Drive is our IR as a joke, but like it's actually turned into a really nice IR type thingy…. 'Sounds like an article for Code4Lib Journal. Hint, hint. --Eric Agreed! And Chris, if you are so inclined: http://journal.code4lib.org/call-for-submissions …and/or ask me if you have any questions. Peter (Code4Lib Journal coordinating editor for issue #20) -- Peter Murray Assistant Director, Technology Services Development LYRASIS peter.mur...@lyrasis.org +1 678-235-2955 800.999.8558 x2955
Re: [CODE4LIB] Handwriting and ocr
That's cool! I created an entry for T-PEN in FOSS4Lib (http://foss4lib.org/package/t-pen) so others can more easily find it. (Jim: I also had the FOSS4Lib site send you a login id/password so you can go in and update the T-PEN entry in case I got anything wrong.) Thanks for the self-promotion! Peter On Mar 12, 2013, at 3:10 PM, James Ginther gint...@slu.edu wrote: At the risk of shameless self-promotion, I would suggest an alternative to the attempt at using OCR for handwriting. My field of research focuses on pre-modern manuscripts which, to no one's surprise, have resisted any OCR method. One solution is to create an environment that makes transcribing an effective and efficient task. To that end, here at Saint Louis University, we built a web-based app called T-PEN. T-PEN attempts to identify the location of each line on a digital surrogate and then displays it with a text box underneath to ensure accurate transcription. The URL is t-pen.org. It's free for anyone. In addition to the repositories that have given us access, users can upload private images to work with. I know that this solution is not ideal for large sets of handwritten texts, but T-PEN does support crowd-sourcing (what we call public projects). You can also encode as you transcribe and then export the transcription as an XML document (and you can even export transcriptions in OAC currently as RDF/XML). There is introductory video at http://www.youtube.com/watch?feature=player_embeddedv=_81fJbOpTcE. Jim On Tue, Mar 12, 2013 at 2:00 PM, Kyle Banerjee kyle.baner...@gmail.comwrote: If it's for a discrete project, I'd say scan what you need OCR'd and put it on Mechanical Turk kyle On Tue, Mar 12, 2013 at 10:56 AM, Donna Campbell dcampb...@wts.edu wrote: On a related note, I am looking for a recommendation for software that provides OCR for handwriting (print and/or cursive). To clarify, this would be pen ink on paper not digital ink. Thank you, Donna R. 
Campbell Technical Services Systems Librarian (215) 935-3872 (phone) (267) 295-3641 (fax) Mailing Address (via USPS): Westminster Theological Seminary Library P.O. Box 27009 Philadelphia, PA 19118 USA Shipping Address (via UPS or FedEx): Westminster Theological Seminary Library 2960 W. Church Rd. Glenside, PA 19038 USA -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Eric Lease Morgan Sent: Tuesday, March 12, 2013 11:57 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] web-based ocr Does anybody here know of a Web-based OCR program or Web service? Many people want to do OCR against digitized texts. We all know of various OCR applications (Adobe Acrobat, ABBYY FineReader, Google's Tesseract, etc.), but they are not necessarily Web-based. As a service to my university, I thought it might be cool (or kewl) to support an image to text application. Go to Web form. Submit one or more image files. Have OCR done against them no matter how dirty the output. Return plain text. As a bonus, the application would support a REST-ful API. Does anybody know of something like this that exists already? -- Eric Lease Morgan University of Notre Dame -- Peter Murray Assistant Director, Technology Services Development LYRASIS peter.mur...@lyrasis.org +1 678-235-2955 800.999.8558 x2955
Re: [CODE4LIB] Handwriting and ocr
Peter Thanks so much. Your summary of T-PEN's design and capabilities is spot on! jim On Tue, Mar 12, 2013 at 4:12 PM, Peter Murray peter.mur...@lyrasis.orgwrote: That's cool! I created an entry for T-PEN in FOSS4Lib ( http://foss4lib.org/package/t-pen) so others can more easily find it. (Jim: I also had the FOSS4Lib site send you a login id/password so you can go in and update the T-PEN entry in case I got anything wrong.) Thanks for the self-promotion! Peter On Mar 12, 2013, at 3:10 PM, James Ginther gint...@slu.edu wrote: At the risk of shameless self-promotion, I would suggest an alternative to the attempt at using OCR for handwriting. My field of research focuses on pre-modern manuscripts which, to no one's surprise, have resisted any OCR method. One solution is to create an environment that makes transcribing an effective and efficient task. To that end, here at Saint Louis University, we built a web-based app called T-PEN. T-PEN attempts to identify the location of each line on a digital surrogate and then displays it with a text box underneath to ensure accurate transcription. The URL is t-pen.org. It's free for anyone. In addition to the repositories that have given us access, users can upload private images to work with. I know that this solution is not ideal for large sets of handwritten texts, but T-PEN does support crowd-sourcing (what we call public projects). You can also encode as you transcribe and then export the transcription as an XML document (and you can even export transcriptions in OAC currently as RDF/XML). There is introductory video at http://www.youtube.com/watch?feature=player_embeddedv=_81fJbOpTcE. 
Jim On Tue, Mar 12, 2013 at 2:00 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: If it's for a discrete project, I'd say scan what you need OCR'd and put it on Mechanical Turk kyle On Tue, Mar 12, 2013 at 10:56 AM, Donna Campbell dcampb...@wts.edu wrote: On a related note, I am looking for a recommendation for software that provides OCR for handwriting (print and/or cursive). To clarify, this would be pen ink on paper not digital ink. Thank you, Donna R. Campbell Technical Services Systems Librarian (215) 935-3872 (phone) (267) 295-3641 (fax) Mailing Address (via USPS): Westminster Theological Seminary Library P.O. Box 27009 Philadelphia, PA 19118 USA Shipping Address (via UPS or FedEx): Westminster Theological Seminary Library 2960 W. Church Rd. Glenside, PA 19038 USA -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Eric Lease Morgan Sent: Tuesday, March 12, 2013 11:57 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] web-based ocr Does anybody here know of a Web-based OCR program or Web service? Many people want to do OCR against digitized texts. We all know of various OCR applications (Adobe Acrobat, ABBYY FineReader, Google's Tesseract, etc.), but they are not necessarily Web-based. As a service to my university, I thought it might be cool (or kewl) to support an image to text application. Go to Web form. Submit one or more image files. Have OCR done against them no matter how dirty the output. Return plain text. As a bonus, the application would support a REST-ful API. Does anybody know of something like this that exists already? -- Eric Lease Morgan University of Notre Dame -- Peter Murray Assistant Director, Technology Services Development LYRASIS peter.mur...@lyrasis.org +1 678-235-2955 800.999.8558 x2955 -- -- James R. 
Ginther, PhD Professor of Medieval Theology, Associate Chair, Department of Theology Director, Center for Digital Theology Saint Louis University - gint...@slu.edu Faculty Page: Departmental Pagehttps://sites.google.com/a/slu.edu/james-ginther/ https://sites.google.com/a/slu.edu/james-ginther/Research Blog: http://digital-editor.blogspot.com Twitter: DH_editor http://twitter.com/#!/DH_editor T-PEN: www.tpen.org/ NOTE: This e-mail message may contain information that may be privileged, confidential, and exempt from disclosure. It is intended for use only by the person(s) to whom it is addressed. If you have received this message in error, please do not forward or use this information in any way; delete it immediately, and contact the sender as soon as possible by the reply option or by telephone at 314-977-4248.
Re: [CODE4LIB] Handwriting and ocr
The Image and Spatial Data Analysis Group at the National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign presented at SAA 2012 on their work with OCR and handwritten census data. Very interesting, and if I recall correctly there was mention of grant opportunities for work on large data sets. ISDA Census page: http://isda.ncsa.illinois.edu/drupal/project/census Presentation on slideshare: http://www.slideshare.net/NARACAST/free-and-searchable-access-to-the-1940-census-data - Brian --- Brian Wilson Digital Processing Archivist Archives and Library Benson Ford Research Center The Henry Ford 20900 Oakwood Blvd. Dearborn, MI 48124 (313) 982-6100, x2293 bri...@thehenryford.org http://www.thehenryford.org/research --- On Tue, Mar 12, 2013 at 1:56 PM, Donna Campbell dcampb...@wts.edu wrote: On a related note, I am looking for a recommendation for software that provides OCR for handwriting (print and/or cursive). To clarify, this would be pen ink on paper not digital ink.