Re: [CODE4LIB] Providing Search Across PDFs

2013-02-20 Thread Michele R Combs
What about just a Google site search?

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nathan 
Tallman
Sent: Wednesday, February 20, 2013 12:54 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Providing Search Across PDFs

My institution is looking for ways to provide search across PDFs through our 
website. Specifically, PDFs linked from finding aids. Ideally searching within 
a collection's PDFs or possibly across all PDFs linked from all finding aids.

We do not have a CMS or a digital repository. A digital repository is on the 
horizon, but it's a ways out and we need to offer the search sooner.
I've looked into Swish-e but haven't had much luck getting anything off the 
ground.

One way we know we can do this is through our discovery layer, VuFind, using its 
ability to full-text index a website based on a sitemap (which would include 
PDFs linked from finding aids). Facets could be created for collections, and 
we may be able to create a search box on the finding aid nav that searches 
specifically that collection.

But I'm not sure how scalable that solution is. The indexing agent cannot 
discern when a page was updated, so it has to re-scrape everything, every 
night. The impetus collection is going to have over 1000 PDFs, and that's 
just to start. Creating the index will start to take a long, long time.
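
One possible way around the nightly full re-scrape, sketched here in Python 
under several assumptions: the PDFs are reachable on a local or mounted 
filesystem, and whatever does the indexing can be handed a list of files. The 
names find_changed and manifest.json are hypothetical and are not part of 
VuFind or Swish-e. The idea is to keep a manifest of content hashes from the 
previous run and pass along only the PDFs whose hash is new or different.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, reading it in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def find_changed(pdf_dir, manifest_path):
    """Return PDFs that are new or changed since the last run.

    The manifest (a JSON file mapping relative path -> digest) is rewritten
    on every call, so files that have disappeared drop out of it.
    """
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {}
    changed = []
    for pdf in sorted(pdf_dir.rglob("*.pdf")):
        key = str(pdf.relative_to(pdf_dir))
        new[key] = file_digest(pdf)
        if old.get(key) != new[key]:
            changed.append(pdf)
    manifest_path.write_text(json.dumps(new, indent=2))
    return changed
```

Every file is still read once per night to compute its hash, but that is 
usually far cheaper than extracting and re-indexing the text of all of them. 
Only the paths returned by find_changed would need to go to the indexer.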

Does anyone have any ideas or know of any useful tools for this project?
Doesn't have to be perfect, quick and dirty may work. (The OCR's dirty anyway 
:-)

Thanks,
Nathan


Re: [CODE4LIB] You *are* a coder. So what am I?

2013-02-14 Thread Michele R Combs
I dub thee...LIBRARIAN!!

If it looks like a librarian, and talks like a librarian, and does librarian 
stuff, then I'd say it is one :)

Michele

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Devon
Sent: Thursday, February 14, 2013 10:10 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] You *are* a coder. So what am I?

If you want to call yourself a librarian, just do it. There's no pope of 
librarianship to tell you otherwise.


On Wed, Feb 13, 2013 at 7:22 PM, Maccabee Levine levi...@uwosh.edu wrote:

 Andromeda's talk this afternoon really struck a chord, as I shared 
 with her afterwards, because I have the same issue from the other side of the 
 fence.
  I'm among the 1/3 of the crowd today with a CS degree and an IT 
 background (and no MLS).  I've worked in libraries for years, but when 
 I have a point to make about how technology can benefit instruction or 
 reference or collection development, I generally preface it with "I'm 
 not a librarian, but..."  I shouldn't have to be defensive about that.


Re: [CODE4LIB] Lib or Libe

2013-02-13 Thread Michele R Combs
Or, in the immortal words of Monty Python:  No, no, it's spelt 'luxury yacht' 
but it's pronounced 'throat-wobbler mangrove'...

http://www.youtube.com/watch?v=tyQvjKqXA0Y 

Michele

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Thomas 
Bennett
Sent: Wednesday, February 13, 2013 11:18 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Lib or Libe

After voting I am surprised at the results; it's a library as in libe, not a 
leebrary as in lib, ryght or is that reeght or rit ?.

Thomas or is it Thoomas

you say tomato I say tomato
pecan or pecan
In these two examples maybe pronounce it as you wish or weesh or woosh, what 
ever.



Thomas McMillan Grant Bennett
Operations  Systems Analyst, University Library
Appalachian State University
P O Box 32026, Boone, North Carolina 28608
(828) 262 6587
Library Systems: http://www.library.appstate.edu
Support Request: http://portal.support.appstate.edu



On Feb 13, 2013, at 11:08 AM, Fleming, Declan wrote:

 Hi - at the conference, there has been much foment about how to pronounce the 
 end of code4lib.
 
 Please go to:
 https://docs.google.com/forms/d/1lseCc2gwQUXL6oC8aLB7N8YMRnjsl90SfPHAmX5EA_w/viewform
 
 and vote.
 
 D


Re: [CODE4LIB] Question abt the code4libwomen idea

2012-12-19 Thread Michele R Combs
Spot on, totally agree :)

Michele

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Bess 
Sadler
Sent: Tuesday, December 18, 2012 10:24 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Question abt the code4libwomen idea

...Having a policy in place (which was my only request in that original email, 
and which we now have, yay!) is a good idea regardless of whether any 
individual incident in the past meets anyone's individual criteria for 
harassment...These things are not really news-worthy individually. I would 
prefer instead to put energy into knowing how to respond to problematic 
behavior in the moment, how to discuss questions of privilege and inclusiveness 
without creating hostility, and how to make library technology more inclusive 
in general. 


Re: [CODE4LIB] Question abt the code4libwomen idea

2012-12-18 Thread Michele R Combs
Much better to do it that way than on the list, IMHO.  Then the list can get 
back to code :)

It's possible that the ratio of idiots at a code4lib function is comparable to 
the ratio of idiots anywhere else (e.g., an ALA conference or SAA function or, 
heck, your basic office party).  In that case, I submit that no special method 
of attack or treatment is required -- just the same approach used when one 
encounters jerks in any other area of one's life.

Michele

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Jonathan 
Rochkind [rochk...@jhu.edu]
Sent: Tuesday, December 18, 2012 7:14 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Question abt the code4libwomen idea

...Is this a good idea, or just a disaster trainwreck lying in wait? If
it's a good idea, we could easily set up a wiki page where people can
easily anonymously describe incidents (again, what I'm going for is NOT
calling specific people out, but just giving us an idea of what it is
that has happened that we're trying to stop from happening, you know?)...


[CODE4LIB] Job Opening - IT Analyst

2012-12-11 Thread Michele R Combs
The Syracuse University Libraries is seeking an Information Technology (IT) 
Analyst to support its growing development of customized web-based solutions 
for both patron-facing and back-end administrative tools.  Under the general 
direction of the Senior Information Technology Programmer/Analyst and in 
collaboration with library staff, the Information Technology (IT) Analyst will 
assist in architecting, designing, developing, and implementing customized, 
complex, database-driven technical solutions for the Syracuse University Library 
in an effort to provide intuitive interfaces for users and automate processes 
where necessary. This position is responsible for high-end web and mobile 
application development efforts for, but not limited to, the Library website, 
various third-party research tool customization, and grant-funded projects.

For more information and to apply go to:
https://www.sujobopps.com/postings/47574


Re: [CODE4LIB] Gender Survey Summary and Results

2012-12-05 Thread Michele R Combs
I second this, in its entirety.

Michele

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Roy 
Tennant
Sent: Wednesday, December 05, 2012 4:35 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Gender Survey Summary and Results

On Wed, Dec 5, 2012 at 12:57 PM, Rosalyn Metz rosalynm...@gmail.com wrote:
 Karen had the idea of creating a women Code4Lib IRC channel, maybe 
 that can be a place to start.

I understand the motivation to create a safe space for women, but please 
let's not do this. Separate but equal has never been shown to make progress 
toward equality, and I doubt this situation would be any different. I believe 
it would instead make things worse, by balkanizing the community rather than 
encouraging good behavior within a unified group. In other words, the solution 
will never be reached without active participation by men.
Roy


Re: [CODE4LIB] Survey

2012-11-27 Thread Michele R Combs
I'm not sure that would work.  We aren't interested in library staff, we're 
interested in the CODE4LIB community, yes?  My manager doesn't know all the 
lists I subscribe to, or the communities I consider myself a member of, so I 
don't see any way for a library to report reliably on behalf of its staff.  
Pretty much by definition, if you want to know demographics for a community, 
you have to ask the members directly.

Not to mention the question of including an "other" option for gender -- a 
library isn't likely to be able to determine that for its staff :)

Michele

On Tue, Nov 27, 2012 at 1:41 PM, Karen Coyle li...@kcoyle.net wrote:
 Joe, what I was hoping for was not a survey where individuals report 
 on themselves, but a statistical sample of libraries where the library 
 reports on its staff. That avoids the self-image issue, and the 
 selection that individual reporting on self entails.


Re: [CODE4LIB] Easiest way to tag thousands of images

2012-11-21 Thread Michele R Combs
Good piece in yesterday's New York Times on this very topic, about a project at 
Princeton University:

http://www.nytimes.com/2012/11/20/science/for-web-images-creating-new-technology-to-seek-and-find.html

Of course, you have to be careful:

http://www.news.com.au/entertainment/television/petreaus-sex-scandal-tv-station-gets-caught-out-by-google-image-search
 

Michele

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Joe 
Hourcle
Sent: Tuesday, November 20, 2012 7:10 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Easiest way to tag thousands of images

On Nov 20, 2012, at 2:54 PM, Kyle Banerjee wrote:

 My real question is whether anyone has come up with a really good way 
 to assign metadata to thousands of photos, preferably in batch 
 fashion? Thanks,

There's a whole field of what they call 'computer vision', which is basically 
looking for things in images.  (face detection is just a small subset of it):

http://en.wikipedia.org/wiki/Computer_vision

Most of the ones that I know of work with very constrained images (they're all 
pictures of the sun, at a known pixel scale, so they know what size/scale the 
items are that they're looking for)


The various 'find similar image' engines might be useful to do some sort of 
clustering of the images to make them easier to process.  (eg, extract 
landscapes vs. buildings vs. people)

Wikipedia has a list of various implementations:

http://en.wikipedia.org/wiki/List_of_CBIR_engines

-Joe


Re: [CODE4LIB] Easiest way to tag thousands of images

2012-11-21 Thread Michele R Combs
Oops, bad link on the second one:

http://www.news.com.au/entertainment/television/petreaus-sex-scandal-tv-station-gets-caught-out-by-google-image-search/story-e6frfmyi-1226517360280
 



Re: [CODE4LIB] Using dbpedia to generate EAC-CPF collections

2012-10-03 Thread Michele R Combs
Wow.  That's pretty spiff!  I'd love to see your Roman Empire SNAC, can you 
send me the info?

Michele

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ethan 
Gruber
Sent: Wednesday, October 03, 2012 11:04 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Using dbpedia to generate EAC-CPF collections

Hi all,

In the last few weeks, I have undertaken a project to generate EAC-CPF stubs using 
dbpedia and VIAF data for the Roman emperors and their relations.  There's a 
lot of great information available through dbpedia, and since it's available in 
RDF, I put together a PHP script that can start at one point in dbpedia (e.g., 
http://dbpedia.org/resource/Augustus) and traverse through its relations to 
create a network of stubs using links to parents, children, spouses, 
influences, successors, and predecessors provided in the RDF.  Left unchecked, 
the script would crawl forward through the Byzantine period to spread laterally 
(chronologically speaking) to generate a network of the ruling hierarchy of the 
West up to the modern period.  It also goes backwards to the successors of 
Alexander the Great.  For all I know, it goes back through all of the Egyptian 
dynasties to Narmer ca. 3000 BC, but I haven't let the script go that far.

The script is fairly generalizable, and can begin at any dbpedia resource.
It's available at
https://github.com/ewg118/xEAC/blob/master/misc/dbpedia-to-eac.php

I should also note that this is a work in progress.  To execute the script, 
you'll need to place a temp folder in the same place you download/execute it 
(for writing EAC records).

At a glance, here's what it does:

-Creates nameEntries for all of the names available in various languages in 
dbpedia.
-If a VIAF ID is available in the RDF, pulls some alternate record IDs from 
VIAF, as well as birth and death dates.
-Can pull in subjects, occupations, and related resources on the web.
-Generates corporate/personal/family relations given the 
parents/children/spouses/influences/successors/predecessors/dynasties 
linked in dbpedia.  These relations are added to an array, which is processed 
continually until, presumably, it reaches the end of time.
-You can specify an end record to attempt to break this chain, but I cannot 
guarantee that it'll work.  Anastasius (emperor of Rome ca. 500 AD) does 
actually successfully terminate the Augustus chain.
-Imports birth and death places (and associated birth and death dates, if 
available).
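
The traversal described in the list above can be sketched roughly as follows. 
This is an illustration in Python, not the PHP script itself: the function 
crawl, the max_records cap, and the toy graph are all hypothetical, and a real 
run would look relations up in dbpedia's RDF rather than in a dictionary. It 
shows a queue of resources still to process, a set of ones already seen, and 
an optional end record whose own relations are not followed.

```python
from collections import deque


def crawl(start, get_relations, end_record=None, max_records=1000):
    """Breadth-first walk over related resources; returns them in visit order.

    get_relations(resource) must return an iterable of related resources.
    max_records is a safety cap so the walk cannot run "to the end of time".
    """
    queue = deque([start])
    seen = {start}
    visited = []
    while queue and len(visited) < max_records:
        resource = queue.popleft()
        visited.append(resource)  # a stub would be written for this resource
        if resource == end_record:
            continue  # keep the end record itself, but do not follow its links
        for related in get_relations(resource):
            if related not in seen:
                seen.add(related)
                queue.append(related)
    return visited


# Toy stand-in for the dbpedia relations (hypothetical data, for illustration).
TOY_GRAPH = {
    "Augustus": ["Tiberius", "Livia"],
    "Tiberius": ["Caligula", "Augustus"],
    "Livia": ["Tiberius"],
    "Caligula": ["Claudius"],
    "Claudius": [],
}


def toy_relations(resource):
    return TOY_GRAPH.get(resource, [])
```

In this sketch an end record only cuts the paths that pass through it. If any 
other chain of relations leads around it, the walk carries on, which may be 
one reason an end record is not guaranteed to stop the crawl.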

I think that these stubs are a good starting point for handing off the 
management of EAC content to subject specialists who can add chronological and 
geographical context.  I wrote a bit more about this script and the process 
applied to xEAC, an XForms-based engine for creating, editing, managing, and 
publishing EAC-CPF collections at 
http://eaditor.blogspot.com/2012/10/using-dbpedia-to-jumpstart-eac-cpf.html

There's a prototype collection of the Roman Empire; if anyone is interested in 
taking a look at it, drop me a line off the list.

Ethan


Re: [CODE4LIB] visualize website

2012-08-30 Thread Michele R Combs
Wow, what a great site!  Have bookmarked for future exploration, thanks!

Michele

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of miles 
stauffer
Sent: Thursday, August 30, 2012 1:04 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] visualize website

Is this what you are looking for?

http://selection.datavisualization.ch/

I have found this site to be fantastic. I am not 100% sure if this answers your 
question. Please let me know if this is not what you are looking for.


miles



On Thu, Aug 30, 2012 at 8:59 AM, Rosalyn Metz rosalynm...@gmail.com wrote:

 I'd be interested in hearing the answer to this too.  We have a bunch 
 of files (pdfs, docs, simple html) on a server that we need to migrate 
 to a new server.  It would be great to know what the heck is on the 
 old server and what would be amazing is to see how often they get used.

 On Thu, Aug 30, 2012 at 11:52 AM, Shearer, Timothy J  
 tshea...@email.unc.edu
  wrote:

  Hi Folks,
 
  We're doing a survey of our web content and I'm looking for 
  visualization tools.  The content is on a redhat box served up by apache.
 
  tree gives a nice, but hard to interact with, view of the file system.
 
  Anyone recommend a tool or set of tools they like?
 
  Thanks,
  Tim
 



Re: [CODE4LIB] Browser Support

2012-08-02 Thread Michele R Combs
Of course, rapid changes in technology mean that something might not work in 
*newer* versions, but usually it's older versions that you have to worry about. 
 So from a testing/development perspective having such a policy makes a lot of 
sense.  It sets bounds on what you have to test and lets you know what cool new 
features you can exploit.  For example, say you're responsible for maintaining 
a library website and you want to add some neat new functionality that isn't 
supported in, say, IE6; if your policy says you only support IE7 or later then 
it makes it easy to know that that's OK (and you have something to back up your 
decision if a user complains!).  Or maybe you're in the testing phase and 
working on Safari; if your policy says you only support Safari 5 or later, you 
don't have to test in earlier versions.
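
As a rough illustration only, a policy like that can be written down as data 
and checked in code. The sketch below is Python; the names MIN_SUPPORTED and 
is_supported are made up for this example, the version numbers are just the 
ones mentioned above (IE7 or later, Safari 5 or later), and treating unlisted 
browsers as unsupported is an assumption.

```python
# Minimum supported major version per browser (example values from the text).
MIN_SUPPORTED = {"ie": 7, "safari": 5}


def is_supported(browser, version, policy=MIN_SUPPORTED):
    """Return True if the browser's major version meets the policy minimum."""
    minimum = policy.get(browser.lower())
    if minimum is None:
        return False  # assumption: browsers not listed are not supported
    major = int(str(version).split(".")[0])  # "5.1.7" -> 5
    return major >= minimum
```

A test plan could then be limited to the browsers and versions for which 
is_supported returns True.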

Michele

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ron 
Gilmour
Sent: Thursday, August 02, 2012 10:29 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Browser Support

This strikes me as a strange thing to have a policy about. Between the rapid 
development cycles of Chrome and Firefox and the ever-expanding diversity of 
mobile platforms and browsers, I don't see how such a policy could possibly be 
kept current and meaningful.

Ron Gilmour
Ithaca College Library


Re: [CODE4LIB] LoC job opening ???

2012-07-09 Thread Michele R Combs
Are the cats classified?

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Simon 
Spero
Sent: Monday, July 09, 2012 1:56 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] LoC job opening ???

On Jul 9, 2012 1:27 PM, Joshua Gomez jngo...@gwu.edu wrote:

 WE NEED A CAT LOVER WHO IS ALSO A FEDERAL EMPLOYEE TO DO THIS JOB!

Must have active TS/SCI clearance with FS Poly.

All applicants must complete the attached 20 page KSA.


Re: [CODE4LIB] Studying the email list (Charcuterie Spectrum)

2012-06-05 Thread Michele R Combs
Perhaps spam  spam spam spam spam spam spam baked beans egg and spam?

Michele

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kevin 
S. Clarke
Sent: Tuesday, June 05, 2012 4:02 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Studying the email list (Charcuterie Spectrum)

On Tue, Jun 5, 2012 at 3:55 PM, BWS Johnson abesottedphoe...@yahoo.com wrote:

 Alas, bologna as the seal of disapproval might fall a bit short. While one 
 might jump to proffer spam in its place, Hawai'ians quite like spam, leaving 
 us all in a bit of a quandary.


Re: [CODE4LIB] Studying the email list (Charcuterie Spectrum)

2012-06-05 Thread Michele R Combs
I dunno, it's hard to imagine anything that's been sitting on a bar stool since 
before I was born as being remotely attractive.  But that might just be because 
I'm old.  Well, old-ish.

Michele

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mark A. 
Matienzo
Sent: Tuesday, June 05, 2012 4:17 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Studying the email list (Charcuterie Spectrum)

On Tue, Jun 5, 2012 at 4:10 PM, Becky Yoose b.yo...@gmail.com wrote:

 We need a meat that is disapproved of universally. May I suggest 
 pickled pig's ears that have been sitting in a jar on a bar counter 
 since you've been born?

There are cultural assumptions in this disapproval. I suggest you retract this 
proposal immediately.


[CODE4LIB] Proquest dissertation XML?

2012-05-10 Thread Michele R Combs
Hi all --

Has anyone written an XSL style sheet (or other script) to transform ProQuest's 
dissertation metadata XML into (a) Dublin Core or (b) MARCXML?

Thanks

Michele

+++
Michele Combs
Lead Archivist
Special Collections Research Center
Syracuse University
315-443-2081
mrrot...@syr.edu   
scrc.syr.edu 
library-blog.syr.edu/scrc


Re: [CODE4LIB] Proquest dissertation XML?

2012-05-10 Thread Michele R Combs
Thanks, Terry - I posted this on behalf of our Scholarly Communications 
Librarian, so will forward the info to her.

Michele 

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Reese, 
Terry
Sent: Thursday, May 10, 2012 11:19 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Proquest dissertation XML?

I actually wrote a simple one for someone else and include it in MarcEdit; it's 
also available for download into MarcEdit from the XSLT registry the program 
uses (I wish I had been paying attention and realized someone else had already 
done this work), but I've attached it.  It's fairly simplistic, but it does 
convert the dissertation XML to MARCXML.

--tr



-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nick 
Ruest
Sent: Thursday, May 10, 2012 8:14 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Proquest dissertation XML?

Hi Michele,

This might be a helpful start: http://journal.code4lib.org/articles/1647

-nruest

On 12-05-10 11:11 AM, Michele R Combs wrote:
 Hi all --

 Has anyone written an XSL style sheet (or other script) to transform 
 ProQuest's dissertation metadata XML into (a) Dublin Core or (b) MARCXML?

 Thanks

 Michele

 +++
 Michele Combs
 Lead Archivist
 Special Collections Research Center
 Syracuse University
 315-443-2081
 mrrot...@syr.edu
 scrc.syr.edu
 library-blog.syr.edu/scrc


Re: [CODE4LIB] Proquest dissertation XML?

2012-05-10 Thread Michele R Combs
Very much so!  Thanks --

Michele

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nick 
Ruest
Sent: Thursday, May 10, 2012 11:14 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Proquest dissertation XML?

Hi Michele,

This might be a helpful start: http://journal.code4lib.org/articles/1647

-nruest

On 12-05-10 11:11 AM, Michele R Combs wrote:
 Hi all --

 Has anyone written an XSL style sheet (or other script) to transform 
 ProQuest's dissertation metadata XML into (a) Dublin Core or (b) MARCXML?

 Thanks

 Michele

 +++
 Michele Combs
 Lead Archivist
 Special Collections Research Center
 Syracuse University
 315-443-2081
 mrrot...@syr.edu
 scrc.syr.edu
 library-blog.syr.edu/scrc