Re: [CODE4LIB] Where can I find a basic set of user stories for a digital library?
I don't know of a list like this that exists anywhere. I would however be willing to talk with you about our own collections and audiences. Perhaps a literature review may yield something like this.

This DPLA tutorial on promoting digital collections provides some good tips on identifying audiences based on the content you have.
http://dp.la/info/about/projects/public-library-partnerships/promoting-use-of-your-digital-content/

Erica

*Erica Findley*
Cataloging/Metadata Librarian
Multnomah County Library
Phone: 503.988.5466
eri...@multcolib.org
multcolib.org <http://www.multcolib.org>

On Wed, May 11, 2016 at 1:58 PM, Wilhelmina Randtke <rand...@gmail.com> wrote:

> Does anyone have a set of user stories for a digital library that you'd be willing to share? Or, is there a good place to look this up and pull a set?
>
> I'm working with these types of materials: old photos, digitized books, digitized newspapers, ETDs. Pretty much the basics of digital library content. I'm interested in a listing of ways people would use these, so I can better understand what the platforms I'm working with do well and where gaps are.
>
> -Wilhelmina Randtke
Re: [CODE4LIB] using drupal for a document repository
Good evening,

Our system is based on Drupal, although it was optimized for images and not documents. We are currently working on a better document display. This did require several hours of development, so I am not sure it is exactly what you were asking for. If you want to know more about the inner workings of it, I can put you in touch with someone to answer questions.

https://gallery.multcolib.org/

Erica

*Erica Findley*
Cataloging/Metadata Librarian
Multnomah County Library
Phone: 503.988.5466
eri...@multcolib.org
multcolib.org <http://www.multcolib.org>

On Thu, May 5, 2016 at 4:28 PM, Cary Gordon <listu...@chillco.com> wrote:

> You can build a peachy document repository in Drupal. This will work fine if you have a small collection, say less than 10k items.
>
> The issue is that it won’t scale. As a Drupal fanboy, I would love to see an all Drupal solution work, but, at least at this point, it doesn't.
>
> We work with Islandora, which puts a Drupal front-end on Fedora. OOTB, Islandora is weighted towards the Fedora side, but the community has been working to move the balance to do more in Drupal. This will be easier once Islandora completes its move to Fedora 4.
>
> FWIW, we offer Islandora in a hosted and fully supported service package that we call LibraryDAMS.
>
> Thanks,
>
> Cary
>
> On May 5, 2016, at 2:15 PM, Kelsey Williamson <kelseyfayesaw...@gmail.com> wrote:
> >
> > Hi code4lib,
> > I was hoping to get some input on this. My small, scrappy institution is considering using drupal as a repository, primarily via the Biblio module.
> >
> > Obviously this is not ideal, but for reasons I won't get into, our tech environment won't support ePrints or dspace, and hosted services are not an option either. We do not really have the level of technical expertise required to support any fedora-based applications, and cannot hire any additional support. There's a chance existing staff could stretch to get there, but it would not be a pretty process.
> >
> > With all that said, do any red flags come to mind? I looked through both code4lib and drupal4lib listserv archives and poked around google, but didn't find much evidence of anyone else using drupal in this way. Seems suspicious. While my gut tells me it's a bad idea (metadata! standards! preservation!), I'm having trouble articulating this to my group in a way that sticks, because using Biblio would be easy. I would appreciate hearing any other thoughts or opinions on this.
> >
> > Thanks!
> > Kelsey
[CODE4LIB] Do you use alt tags in your images for digital collections
Good evening,

We are currently experiencing a dilemma with alt tags in our digital collections. We would like to include alt tags to be in compliance with accessibility guidelines. When looking at an item detail page <https://gallery.multcolib.org/image/widmer-kegs-trailer>, there is a lot of surrounding metadata to help visualize the image, but on our search results <https://gallery.multcolib.org/search/site> pages, that detail is not present.

Currently, a screen reader is not reading the titles of the images on our search results page. We are able to add alt tags to the images to help with this. Our dilemma is what those tags should be so that they are not redundant with either the title or description metadata, but are still helpful.

Are any of you using alt tags in your images for digital collections similar to ours? If so, what guidelines do you use to create those alt tags?

Erica

*Erica Findley*
Cataloging/Metadata Librarian
Multnomah County Library
Phone: 503.988.5466
eri...@multcolib.org
multcolib.org <http://www.multcolib.org>
Re: [CODE4LIB] Anyone Doing Interesting Things With Digital Collection Systems?
We just designed our own responsive site at Multnomah County Library for digital collections that is also OAI-PMH compatible. We call it The Gallery.

https://gallery.multcolib.org/

Erica

*Erica Findley*
Cataloging/Metadata Librarian
Multnomah County Library
Phone: 503.988.5466
eri...@multcolib.org
multcolib.org <http://www.multcolib.org>

On Mon, Feb 29, 2016 at 4:29 AM, Scancella, John <j...@loc.gov> wrote:

> Hi Matt,
>
> I work on the digital repository for the Library of Congress. We have a lot of our tools on our public github https://github.com/LibraryOfCongress
>
> Of particular interest would be the bagit-python, and bagit-java. Note that for bagit-java we are in the middle of a rewrite so if you plan on using it for more than the near term you should check out the https://github.com/LibraryOfCongress/bagit-java/tree/rewrite branch or BETA release http://search.maven.org/#artifactdetails|gov.loc|bagit|5.0.0-BETA|jar
>
> John
> Please note: all opinions expressed in this email are my own and do not reflect those of The Library Of Congress
>
> -----Original Message-----
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Erin Tripp
> Sent: Monday, February 29, 2016 7:19 AM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] Anyone Doing Interesting Things With Digital Collection Systems?
>
> Hi Matt,
>
> The Islandora Community (http://islandora.ca/about) is releasing some lovely open source digital repositories. Islandora is interoperable and extensible through the Tuque API, the Islandora OAI module, and many other tools that are included in the software stack.
>
> Here are a few repositories to explore:
> http://dcmny.org/
> http://dlib.bc.edu/
> http://repository.lib.cuhk.edu.hk/
> http://arcabc.ca/
>
> We have monthly webinars on Islandora if you'd like to join and learn more.
>
> ~ Erin
>
> Erin Tripp, BJH MLIS
> Business Development Manager
> discoverygarden inc.
> e...@discoverygarden.ca
Re: [CODE4LIB] Looking for a script to clean up OCR text files
Thanks everyone for your ideas and suggestions. There are many things I am going to take a look at here, and perhaps this is a good time for me to learn some regular expressions.

I also want to respond regarding my desire to clean up the formatting of the OCR data (line breaks, junk characters, spacing, etc.). In our current web platform for digital objects, I input the OCR text into a field (either manually or by batch import). Having clean formatting without line breaks or extra characters will make the data in that field more portable. This data may be exported, harvested, and/or eventually migrated. I figured that getting the extra stuff out now would save some headaches later. Having it look nice to humans is a plus.

Thanks again! I will share the solution I implement when I get there.

Erica

*Erica Findley*
Cataloging/Metadata Librarian
Multnomah County Library
Phone: 503.988.5466
eri...@multco.us
www.multcolib.org

On Mon, Nov 24, 2014 at 8:43 AM, Kyle Banerjee kyle.baner...@gmail.com wrote:

> > As for formatting, this one is harder. But instead of trying to solve that, I wonder if you're sure it's worth doing. If you're only using the OCR to drive search of the scanned page images, why does it matter if there are some unnecessary line breaks in your OCR text?
>
> For simple keyword searches, it wouldn't. However, if phrase or entity extraction is an issue, it would be beneficial to remove them. Regex strikes me as a quick and easy way to accomplish this on a large number of files.
>
> kyle
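For anyone reading this thread later, here is a rough sketch of the kind of regex cleanup being discussed, written in Python. The function name `clean_ocr_text` and the set of characters treated as "expected" are assumptions made for illustration, not anything from the thread; the rules would need tuning against real OCR output, and capitalization is left alone.

```python
import re


def clean_ocr_text(text: str) -> str:
    """Flatten OCR output into a single tidy line (illustrative rules only)."""
    # Rejoin words split by a hyphen at a line break: "Sym-\nphony" -> "Symphony".
    # Caveat: this also merges genuinely hyphenated words that happen to wrap.
    text = re.sub(r"(\w)-[ \t]*\r?\n[ \t]*(\w)", r"\1\2", text)
    # Drop anything outside an assumed "expected" set of letters, digits,
    # whitespace and common punctuation. The set is a guess; adjust as needed.
    text = re.sub(r"[^\w\s.,;:!?'\"()&/-]", "", text)
    # Collapse line breaks, tabs and repeated spaces into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()
```

Because `\w` is Unicode-aware for Python 3 strings, accented letters in performer names should survive the junk-character step. It would be worth spot-checking the output against the page images before loading anything into the platform.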
[CODE4LIB] Looking for a script to clean up OCR text files
Greetings,

I am working on a project to digitize concert programs. These are the type of programs you get when attending a musical concert that list performers and details about the concert. Since these items are text heavy, we have decided to use OCR software to output a text file that will enable full-text searching in our platform. These text files are for the most part accurate, but often have unnecessary line breaks, pockets of extra characters, and/or incorrect capitalization. I would like to pretty them up a little bit if possible.

I am wondering if there is a script I can use on multiple files to clean these types of things up. I don't want to have the digitization staff manually edit each text file or have to open each one to run a macro in a text editor. I have been searching online and so far haven't found anything that will work for my situation.

Thanks in advance,

*Erica Findley*
Cataloging/Metadata Librarian
Multnomah County Library
Phone: 503.988.5466
eri...@multco.us
www.multcolib.org
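One possible shape for such a script, sketched in Python: it reads every `.txt` file in one folder and writes a cleaned copy to another, so nobody has to open files one at a time. The names `clean_folder` and `collapse_whitespace` are hypothetical, the files are assumed to be UTF-8, and the default cleaner only collapses whitespace, so it is a starting point and not a finished tool.

```python
from pathlib import Path
from typing import Callable


def collapse_whitespace(text: str) -> str:
    """Placeholder cleaner: turn each run of whitespace (line breaks too) into one space."""
    return " ".join(text.split())


def clean_folder(src: Path, dest: Path,
                 cleaner: Callable[[str], str] = collapse_whitespace) -> int:
    """Clean every .txt file in src, write it under the same name in dest, return the count."""
    dest.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(src.glob("*.txt")):
        # errors="replace" keeps one badly encoded file from stopping the batch.
        raw = path.read_text(encoding="utf-8", errors="replace")
        (dest / path.name).write_text(cleaner(raw), encoding="utf-8")
        count += 1
    return count
```

Calling `clean_folder(Path("ocr_raw"), Path("ocr_clean"))` (folder names made up here) would leave the originals untouched, which makes it easy to compare before and after. Any function that takes a string and returns a string can be passed as `cleaner` to apply stricter rules.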