Re: [CODE4LIB] Providing Search Across PDFs
What about just a Google site search? -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nathan Tallman Sent: Wednesday, February 20, 2013 12:54 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Providing Search Across PDFs My institution is looking for ways to provide search across PDFs through our website. Specifically, PDFs linked from finding aids. Ideally searching within a collection's PDFs or possibly across all PDFs linked from all finding aids. We do not have a CMS or a digital repository. A digital repository is on the horizon, but it's a ways out and we need to offer the search sooner. I've looked into Swish-e but haven't had much luck getting anything off the ground. One way we know we can do this is through our discovery layer, VuFind, using its ability to full-text index a website based on a sitemap (which would include PDFs linked from finding aids). Facets could be created for collections, and we may be able to create a search box on the finding aid nav that searches specifically that collection. But I'm not sure how scalable that solution is. The indexing agent cannot discern when a page was updated, so it has to re-scrape everything, every night. The impetus collection is going to have over 1,000 PDFs. And that's to start. Creating the index will start to take a long, long time. Does anyone have any ideas or know of any useful tools for this project? Doesn't have to be perfect, quick and dirty may work. (The OCR's dirty anyway :-) Thanks, Nathan
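On the nightly re-scrape problem Nathan describes: one possible workaround, sketched here in Python, is to keep a record of each PDF's sitemap <lastmod> value from the previous run and only hand changed or new URLs to the indexer. This is only a sketch under assumptions: it presumes the sitemap actually carries <lastmod> entries, the function name changed_pdfs and the last_indexed mapping are made up for illustration, and it says nothing about whether VuFind's own sitemap indexer can be driven this way.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def changed_pdfs(sitemap_xml, last_indexed):
    """Return PDF URLs that look new or modified since the previous run.

    last_indexed is a hypothetical dict mapping URL -> the <lastmod>
    string recorded the last time that URL was indexed.
    """
    root = ET.fromstring(sitemap_xml)
    todo = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        if not loc.lower().endswith(".pdf"):
            continue  # only PDFs are of interest here
        # No <lastmod> at all: we cannot tell, so re-index to be safe.
        if not lastmod or last_indexed.get(loc) != lastmod:
            todo.append(loc)
    return todo
```

After a successful run, the caller would store each URL's current <lastmod> back into last_indexed (a small JSON file would do), so the next night only the differences get scraped.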
Re: [CODE4LIB] You *are* a coder. So what am I?
I dub thee...LIBRARIAN!! If it looks like a librarian, and talks like a librarian, and does librarian stuff, then I'd say it is one :) Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Devon Sent: Thursday, February 14, 2013 10:10 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] You *are* a coder. So what am I? If you want to call yourself a librarian, just do it. There's no pope of librarianship to tell you otherwise. On Wed, Feb 13, 2013 at 7:22 PM, Maccabee Levine levi...@uwosh.edu wrote: Andromeda's talk this afternoon really struck a chord, as I shared with her afterwards, because I have the same issue from the other side of the fence. I'm among the 1/3 of the crowd today with a CS degree and an IT background (and no MLS). I've worked in libraries for years, but when I have a point to make about how technology can benefit instruction or reference or collection development, I generally preface it with "I'm not a librarian, but". I shouldn't have to be defensive about that.
Re: [CODE4LIB] Lib or Libe
Or, in the immortal words of Monty Python: No, no, it's spelt 'luxury yacht' but it's pronounced 'throat-wobbler mangrove'... http://www.youtube.com/watch?v=tyQvjKqXA0Y Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Thomas Bennett Sent: Wednesday, February 13, 2013 11:18 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Lib or Libe After voting I am surprised at the results, it's a library as in libe, not a leebrary as in lib, ryght or is that reeght or rit? Thomas or is it Thoomas you say tomato I say tomato pecan or pecan In these two examples maybe pronounce it as you wish or weesh or woosh, whatever. Support Request http://portal.support.appstate.edu Thomas McMillan Grant Bennett Appalachian State University Operations Systems Analyst P O Box 32026 University Library Boone, North Carolina 28608 (828) 262 6587 Library Systems http://www.library.appstate.edu On Feb 13, 2013, at 11:08 AM, Fleming, Declan wrote: Hi - at the conference, there has been much foment about how to pronounce the end of code4lib. Please go to: https://docs.google.com/forms/d/1lseCc2gwQUXL6oC8aLB7N8YMRnjsl90SfPHAmX5EA_w/viewform and vote. D
Re: [CODE4LIB] Question abt the code4libwomen idea
Spot on, totally agree :) Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Bess Sadler Sent: Tuesday, December 18, 2012 10:24 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Question abt the code4libwomen idea ...Having a policy in place (which was my only request in that original email, and which we now have, yay!) is a good idea regardless of whether any individual incident in the past meets anyone's individual criteria for harassment...These things are not really news-worthy individually. I would prefer instead to put energy into knowing how to respond to problematic behavior in the moment, how to discuss questions of privilege and inclusiveness without creating hostility, and how to make library technology more inclusive in general.
Re: [CODE4LIB] Question abt the code4libwomen idea
Much better to do it that way than on the list, IMHO. Then the list can get back to code :) It's possible that the ratio of idiots at a code4lib function is comparable to the ratio of idiots anywhere else (e.g., an ALA conference or SAA function or, heck, your basic office party). In that case, I submit that no special method of attack or treatment is required -- just the same approach used when one encounters jerks in any other area of one's life. Michele From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Jonathan Rochkind [rochk...@jhu.edu] Sent: Tuesday, December 18, 2012 7:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Question abt the code4libwomen idea ...Is this a good idea, or just a disaster trainwreck lying in wait? If it's a good idea, we could easily set up a wiki page where people can easily anonymously describe incidents (again, what I'm going for is NOT calling specific people out, but just giving us an idea of what it is that has happened that we're trying to stop from happening, you know?)...
[CODE4LIB] Job Opening - IT Analyst
The Syracuse University Libraries is seeking an Information Technology (IT) Analyst to support its growing development of customized web-based solutions for both patron facing and back-end administrative tools. Under the general direction of the Senior Information Technology Programmer/Analyst and in collaboration with library staff, the Information Technology (IT) Analyst will assist in architecting, designing, developing, and implementing customized, complex database driven technical solutions for the Syracuse University Library in an effort to provide intuitive interfaces for users and automate processes where necessary. This position is responsible for high-end web and mobile application development efforts for but not limited to: the Library website, various third party research tool customization, and grant funded projects. For more information and to apply go to: https://www.sujobopps.com/postings/47574
Re: [CODE4LIB] Gender Survey Summary and Results
I second this, in its entirety. Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Roy Tennant Sent: Wednesday, December 05, 2012 4:35 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Gender Survey Summary and Results On Wed, Dec 5, 2012 at 12:57 PM, Rosalyn Metz rosalynm...@gmail.com wrote: Karen had the idea of creating a women Code4Lib IRC channel, maybe that can be a place to start. I understand the motivation to create a safe space for women, but please let's not do this. Separate but equal has never been shown to make progress toward equality, and I doubt this situation would be any different. I believe it would instead make things worse, by balkanizing the community rather than encouraging good behavior within a unified group. In other words, the solution will never be reached without active participation by men. Roy
Re: [CODE4LIB] Survey
I'm not sure that would work. We aren't interested in library staff, we're interested in the CODE4LIB community, yes? My manager doesn't know all the lists I subscribe to, or the communities I consider myself a member of, so I don't see any way for a library to report reliably on behalf of its staff. Pretty much by definition, if you want to know demographics for a community, you have to ask the members directly. Not to mention the question of including an "other" option for gender -- a library isn't likely to be able to determine that for its staff :) Michele On Tue, Nov 27, 2012 at 1:41 PM, Karen Coyle li...@kcoyle.net wrote: Joe, what I was hoping for was not a survey where individuals report on themselves, but a statistical sample of libraries where the library reports on its staff. That avoids the self-image issue, and the selection that individual reporting on self entails.
Re: [CODE4LIB] Easiest way to tag thousands of images
Good piece in yesterday's New York Times on this very topic, about a project at Princeton University: http://www.nytimes.com/2012/11/20/science/for-web-images-creating-new-technology-to-seek-and-find.html Of course, you have to be careful: http://www.news.com.au/entertainment/television/petreaus-sex-scandal-tv-station-gets-caught-out-by-google-image-search Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Joe Hourcle Sent: Tuesday, November 20, 2012 7:10 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Easiest way to tag thousands of images On Nov 20, 2012, at 2:54 PM, Kyle Banerjee wrote: My real question is whether anyone has come up with a really good way to assign metadata to thousands of photos, preferably in batch fashion? Thanks, There's a whole field of what they call 'computer vision', which is basically looking for things in images. (face detection is just a small subset of it): http://en.wikipedia.org/wiki/Computer_vision Most of the ones that I know of work with very constrained images (they're all pictures of the sun, at a known pixel scale, so they know what size/scale the items are that they're looking for) The various 'find similar image' engines might be useful to do some sort of clustering of the images to make them easier to process. (eg, extract landscapes vs. buildings vs. people) Wikipedia has a list of various implementations: http://en.wikipedia.org/wiki/List_of_CBIR_engines -Joe
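Joe's suggestion of clustering similar images before tagging them in batches can be illustrated with a very crude technique, the "average hash". The Python sketch below is not one of the CBIR engines on the Wikipedia list and is nowhere near as capable. It assumes each image has already been shrunk to a small, same-sized grid of grayscale values (say 8x8) by whatever image library is at hand, and the function names and the max_distance threshold are invented for the example.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the mean.

    pixels is assumed to be a small 2D list of grayscale values that some
    other tool has already produced by downscaling the image.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(a, b):
    """Number of positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster(images, max_distance):
    """Greedy grouping: each image joins the first cluster whose
    representative hash is within max_distance bits, else starts a new one."""
    clusters = []  # list of (representative_hash, [image names])
    for name, pixels in images.items():
        h = average_hash(pixels)
        for rep, names in clusters:
            if hamming(rep, h) <= max_distance:
                names.append(name)
                break
        else:
            clusters.append((h, [name]))
    return [names for _, names in clusters]
```

Each resulting group could then be reviewed and given shared metadata in one pass, which is the batch workflow Kyle was asking about. Expect plenty of misfiles; this only catches images with broadly similar light and dark layout.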
Re: [CODE4LIB] Easiest way to tag thousands of images
Oops, bad link on the second one: http://www.news.com.au/entertainment/television/petreaus-sex-scandal-tv-station-gets-caught-out-by-google-image-search/story-e6frfmyi-1226517360280
Re: [CODE4LIB] Using dbpedia to generate EAC-CPF collections
Wow. That's pretty spiff! I'd love to see your Roman Empire SNAC, can you send me the info? Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ethan Gruber Sent: Wednesday, October 03, 2012 11:04 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Using dbpedia to generate EAC-CPF collections Hi all, In the last few weeks, I have undertaken a project of EAC-CPF stubs using dbpedia and VIAF data for the Roman emperors and their relations. There's a lot of great information available through dbpedia, and since it's available in RDF, I put together a PHP script that can start at one point in dbpedia (e.g., http://dbpedia.org/resource/Augustus) and traverse through its relations to create a network of stubs using links to parents, children, spouses, influences, successors, and predecessors provided in the RDF. Left unchecked, the script would crawl forward through the Byzantine period to spread laterally (chronologically speaking) to generate a network of the ruling hierarchy of the West up to the modern period. It also goes backwards to the successors of Alexander the Great. For all I know, it goes back through all of the Egyptian dynasties to Narmer ca. 3000 BC, but I haven't let the script go that far. The script is fairly generalizable, and can begin at any dbpedia resource. It's available at https://github.com/ewg118/xEAC/blob/master/misc/dbpedia-to-eac.php I should also note that this is a work in progress. To execute the script, you'll need to place a temp folder in the same place you download/execute it (for writing EAC records). 
At a glance, here's what it does:
-Creates nameEntries for all of the names available in various languages in dbpedia
-If a VIAF ID is available in the RDF, the script will pull some alternate record IDs from VIAF, as well as birth and death dates
-Can pull in subjects, occupations, and related resources on the web
-Generate corporate/personal/family relations given the parents/children/spouses/influences/successors/predecessors/dynasties linked in dbpedia. These relations are added into an array which continually processes until presumably it reaches the end of time.
-You can specify an end record to attempt to break this chain, but I cannot guarantee that it'll work. Anastasius (emperor of Rome ca. 500 AD) does actually successfully terminate the Augustus chain.
-Import birth and death places (and associated birth and death dates, if available)
I think that these stubs are a good starting point for handing off the management of EAC content to subject specialists who can add chronological and geographical context. I wrote a bit more about this script and the process applied to xEAC, an XForms-based engine for creating, editing, managing, and publishing EAC-CPF collections at http://eaditor.blogspot.com/2012/10/using-dbpedia-to-jumpstart-eac-cpf.html There's a prototype collection of the Roman Empire; if anyone is interested in taking a look at it, drop me a line off the list. Ethan
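The "array which continually processes" that Ethan describes is, in general terms, a queue-driven traversal of a graph of relations. The Python sketch below shows that general pattern only; it is not a port of his PHP script. The names crawl, get_relations, stop_at, and max_records are all made up for the illustration, and get_relations stands in for whatever would actually fetch and parse a dbpedia resource.

```python
from collections import deque


def crawl(start, get_relations, stop_at=None, max_records=None):
    """Breadth-first walk over related records, returning them in visit order.

    get_relations is a caller-supplied function (hypothetical here) that
    returns the identifiers a record links to: parents, spouses, successors...
    """
    seen = {start}          # never queue the same record twice
    queue = deque([start])
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)
        if max_records is not None and len(order) >= max_records:
            break           # hard cap, regardless of what is still queued
        if current == stop_at:
            continue        # keep the end record, but do not follow its links
        for rel in get_relations(current):
            if rel not in seen:
                seen.add(rel)
                queue.append(rel)
    return order
```

In this sketch an end record only prunes its own branch, so other branches (a spouse's family, say) can keep growing past it. A cap on the number of records is a blunter but more predictable brake.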
Re: [CODE4LIB] visualize website
Wow, what a great site! Have bookmarked for future exploration, thanks! Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of miles stauffer Sent: Thursday, August 30, 2012 1:04 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] visualize website Is this what you are looking for? http://selection.datavisualization.ch/ I have found this site to be fantastic. I am not 100% sure if this answers your question. Please let me know if this is not what you are looking for. miles On Thu, Aug 30, 2012 at 8:59 AM, Rosalyn Metz rosalynm...@gmail.com wrote: I'd be interested in hearing the answer to this too. We have a bunch of files (pdfs, docs, simple html) on a server that we need to migrate to a new server. It would be great to know what the heck is on the old server and what would be amazing is to see how often they get used. On Thu, Aug 30, 2012 at 11:52 AM, Shearer, Timothy J tshea...@email.unc.edu wrote: Hi Folks, We're doing a survey of our web content and I'm looking for visualization tools. The content is on a redhat box served up by apache. tree gives a nice, but hard to interact with, view of the file system. Anyone recommend a tool or set of tools they like? Thanks, Tim
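For the two questions in this thread (what is on the old server, and how often does it get used), a rough first pass does not need a visualization tool at all. The Python sketch below counts files by extension under a directory and counts successful requests per path from Apache access-log lines. It assumes the logs are in the common or combined log format, and the function names are invented for the example; the counts could then be fed into whatever charting tool is preferred.

```python
import os
import re
from collections import Counter


def inventory(root):
    """Count files under root by lowercase extension."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(none)"
            counts[ext] += 1
    return counts


# Matches the request and status fields, e.g. "GET /docs/a.pdf HTTP/1.1" 200
# (assumes Apache's common/combined log format).
REQUEST = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3})')


def hits(log_lines):
    """Count 2xx requests per path, ignoring any query string."""
    counts = Counter()
    for line in log_lines:
        m = REQUEST.search(line)
        if m and m.group(2).startswith("2"):
            counts[m.group(1).split("?", 1)[0]] += 1
    return counts
```

Something like hits(open("access_log")) followed by .most_common(20) would show the busiest files, and any path that appears in the inventory but never in the hit counts is a candidate for leaving behind in the migration.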
Re: [CODE4LIB] Browser Support
Of course, rapid changes in technology mean that something might not work in *newer* versions, but usually it's older versions that you have to worry about. So from a testing/development perspective having such a policy makes a lot of sense. It sets bounds on what you have to test and lets you know what cool new features you can exploit. For example, say you're responsible for maintaining a library website and you want to add some neat new functionality that isn't supported in, say, IE6; if your policy says you only support IE7 or later then it makes it easy to know that that's OK (and you have something to back up your decision if a user complains!). Or maybe you're in the testing phase and working on Safari; if your policy says you only support Safari 5 or later, you don't have to test in earlier versions. Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ron Gilmour Sent: Thursday, August 02, 2012 10:29 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Browser Support This strikes me as a strange thing to have a policy about. Between the rapid development cycles of Chrome and Firefox and the ever-expanding diversity of mobile platforms and browsers, I don't see how such a policy could possibly be kept current and meaningful. Ron Gilmour Ithaca College Library
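One way to make such a policy usable by developers and testers is to write it down as data. The Python sketch below uses the two minimums mentioned in the message above (IE7 and Safari 5) purely as an example; the MIN_SUPPORTED table and the is_supported function are hypothetical, not any library's actual policy.

```python
# Example policy only: minimum supported major version per browser,
# taken from the IE7 / Safari 5 illustration in the message above.
MIN_SUPPORTED = {"ie": 7, "safari": 5}


def is_supported(browser, major_version, policy=MIN_SUPPORTED):
    """True if the browser is listed in the policy at or above its minimum."""
    minimum = policy.get(browser.lower())
    return minimum is not None and major_version >= minimum
```

With the policy in one place, the test matrix falls out of it directly (test IE from 7 up, Safari from 5 up), and a complaint about IE6 can be answered by pointing at the table.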
Re: [CODE4LIB] LoC job opening ???
Are the cats classified? -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Simon Spero Sent: Monday, July 09, 2012 1:56 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] LoC job opening ??? On Jul 9, 2012 1:27 PM, Joshua Gomez jngo...@gwu.edu wrote: WE NEED A CAT LOVER WHO IS ALSO A FEDERAL EMPLOYEE TO DO THIS JOB! Must have active TS/SCI clearance with FS Poly. All applicants must complete the attached 20 page KSA.
Re: [CODE4LIB] Studying the email list (Charcuterie Spectrum)
Perhaps spam spam spam spam spam spam spam baked beans egg and spam? Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kevin S. Clarke Sent: Tuesday, June 05, 2012 4:02 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Studying the email list (Charcuterie Spectrum) On Tue, Jun 5, 2012 at 3:55 PM, BWS Johnson abesottedphoe...@yahoo.com wrote: Alas, bologna as the seal of disapproval might fall a bit short. While one might jump to proffer spam in its place, Hawai'ians quite like spam, leaving us all in a bit of a quandary.
Re: [CODE4LIB] Studying the email list (Charcuterie Spectrum)
I dunno, it's hard to imagine anything that's been sitting on a bar stool since before I was born as being remotely attractive. But that might just be because I'm old. Well, old-ish. Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mark A. Matienzo Sent: Tuesday, June 05, 2012 4:17 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Studying the email list (Charcuterie Spectrum) On Tue, Jun 5, 2012 at 4:10 PM, Becky Yoose b.yo...@gmail.com wrote: We need a meat that is disapproved of universally. May I suggest pickled pig's ears that have been sitting in a jar on a bar counter since you've been born? There are cultural assumptions in this disapproval. I suggest you retract this proposal immediately.
[CODE4LIB] Proquest dissertation XML?
Hi all -- Has anyone written an XSL style sheet (or other script) to transform ProQuest's dissertation metadata XML into (a) Dublin Core or (b) MARCXML? Thanks Michele +++ Michele Combs Lead Archivist Special Collections Research Center Syracuse University 315-443-2081 mrrot...@syr.edu scrc.syr.edu library-blog.syr.edu/scrc
Re: [CODE4LIB] Proquest dissertation XML?
Thanks, Terry - I posted this on behalf of our Scholarly Communications Librarian, so will forward the info to her. Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Reese, Terry Sent: Thursday, May 10, 2012 11:19 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Proquest dissertation XML? I actually wrote a simple one for someone else and include it in MarcEdit, or, for download to MarcEdit from the xslt registry the program uses (wish I would have been paying attention realizing someone else did this work) -- but I've attached. This is fairly simplistic, but does the dissertation xml to marcxml. --tr -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nick Ruest Sent: Thursday, May 10, 2012 8:14 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Proquest dissertation XML? Hi Michele, This might be a helpful start: http://journal.code4lib.org/articles/1647 -nruest On 12-05-10 11:11 AM, Michele R Combs wrote: Hi all -- Has anyone written an XSL style sheet (or other script) to transform ProQuest's dissertation metadata XML into (a) Dublin Core or (b) MARCXML? Thanks Michele +++ Michele Combs Lead Archivist Special Collections Research Center Syracuse University 315-443-2081 mrrot...@syr.edu scrc.syr.edu library-blog.syr.edu/scrc
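For the Dublin Core half of the original question, the mapping can also be sketched in a few lines of Python rather than XSLT. This is not Terry's stylesheet and not the approach in the journal article. The element names used below (DISS_title, DISS_author/DISS_name, DISS_surname, DISS_fname, DISS_comp_date, DISS_para) are assumptions about ProQuest's dissertation XML and should be checked against real files; the plain "record" wrapper element is likewise just a placeholder.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def proquest_to_dc(xml_text):
    """Map a few assumed ProQuest fields onto simple Dublin Core elements."""
    src = ET.fromstring(xml_text)
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # placeholder wrapper, not a standard

    def add(name, value):
        # Skip empty or missing values rather than emit empty elements.
        if value and value.strip():
            ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value.strip()

    add("title", src.findtext(".//DISS_title"))
    # Only names under DISS_author, so advisors are not listed as creators.
    for name in src.findall(".//DISS_author/DISS_name"):
        surname = name.findtext("DISS_surname", default="").strip()
        fname = name.findtext("DISS_fname", default="").strip()
        add("creator", ", ".join(p for p in (surname, fname) if p))
    add("date", src.findtext(".//DISS_comp_date"))
    add("description", " ".join((p.text or "") for p in src.iter("DISS_para")))
    return record
```

Calling ET.tostring(proquest_to_dc(xml_text), encoding="unicode") gives the serialized record. Subjects, language, degree information and the like would need their own mappings.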
Re: [CODE4LIB] Proquest dissertation XML?
Very much so! Thanks -- Michele -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nick Ruest Sent: Thursday, May 10, 2012 11:14 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Proquest dissertation XML? Hi Michele, This might be a helpful start: http://journal.code4lib.org/articles/1647 -nruest On 12-05-10 11:11 AM, Michele R Combs wrote: Hi all -- Has anyone written an XSL style sheet (or other script) to transform ProQuest's dissertation metadata XML into (a) Dublin Core or (b) MARCXML? Thanks Michele +++ Michele Combs Lead Archivist Special Collections Research Center Syracuse University 315-443-2081 mrrot...@syr.edu scrc.syr.edu library-blog.syr.edu/scrc