[CODE4LIB] MassLNC RFP Notice of Deadline Extension

2011-03-31 Thread Kathy Lussier
Hi all, Please excuse any cross-postings. The Massachusetts Library Network Cooperative is extending the deadline for its Request for Proposals (RFP) for Evergreen enhancements. The new deadline is 5 p.m. (EDT) today (March 31, 2011). The RFP is available at http://masslnc.cwmars.org/node/2301

[CODE4LIB] regexp for LCC?

2011-03-31 Thread Jonathan Rochkind
Does anyone have a good regular expression that will match all legal LC Call Numbers from the LC Classified Schedule, but will generally not match things that could not possibly be an LC Call Number from the LC Classified Schedule? In particular, I need it to NOT match an MLC call number,

Re: [CODE4LIB] regexp for LCC?

2011-03-31 Thread Tod Olson
Check the regexp that Google uses in their call number normalization: http://code.google.com/p/library-callnumber-lc/wiki/Home You may want to remove the prefix part, and allow for a fourth cutter. The folks at UNC pointed me to this a few months ago. -Tod On Mar 31, 2011, at 11:29

Re: [CODE4LIB] regexp for LCC?

2011-03-31 Thread Jonathan Rochkind
Thanks, that looks good! It's hosted on Google Code, but I don't think that code is anything Google uses, it looks like it's from our very own Bill Dueber. On 3/31/2011 12:38 PM, Tod Olson wrote: Check the regexp that Google uses in their call number normalization:

Re: [CODE4LIB] regexp for LCC?

2011-03-31 Thread Jonathan Rochkind
Except now I wonder if those annoying MLCS call numbers might actually be properly MATCHED by this regex, when I need em excluded. They are annoyingly _similar_ to a classified call number. Well, one way to find out. And the reason this matters is to try and use an LCC to map to a 'discipline'

Re: [CODE4LIB] regexp for LCC?

2011-03-31 Thread Keith Jenkins
The Google Code regex looks like it will accept any 1-3 letters at the start of the call number. But LCC has no I, O, W, X, or Y classifications. So you might want to use something more like ^[A-HJ-NP-VZ] at the start of the regex. Also, there are only a few major classifications that use three
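A minimal sketch of the first-letter check suggested above, in Python. The character class ^[A-HJ-NP-VZ] comes from the message; everything else (the function name, allowing up to two further class letters, and requiring a digit next) is an assumption made for illustration. It only looks at the start of the call number, does not parse cutters or dates, and is not claimed to exclude MLC call numbers.

```python
import re

# First letter: any class letter except I, O, W, X, Y (per the message above).
# Assumption: up to two more class letters, optional whitespace, then a digit.
LCC_START = re.compile(r"^[A-HJ-NP-VZ][A-Z]{0,2}\s*\d")


def looks_like_lcc_start(call_number: str) -> bool:
    """Return True if the call number starts like an LC classification.

    Hypothetical helper: checks only the leading class letters and the
    first digit, nothing after that.
    """
    return LCC_START.match(call_number.strip().upper()) is not None
```

For example, looks_like_lcc_start("QA76.73 .P98") passes the check, while a string starting with X or W does not.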

Re: [CODE4LIB] regexp for LCC?

2011-03-31 Thread Doran, Michael D
Hi Jonathan, Although designed for a different purpose, you might want to take a look at the regex in the LC call number sorting utilities on this page: http://rocky.uta.edu/doran/sortlc/ Note that unparsable call numbers are printed to STDERR with an error message. So you could run it against a

Re: [CODE4LIB] regexp for LCC?

2011-03-31 Thread Naomi Dushay
You could also try to use the code I put in SolrMarc utilities classes ha ha ha. - Naomi On Mar 31, 2011, at 10:25 AM, Keith Jenkins wrote: The Google Code regex looks like it will accept any 1-3 letters at the start of the call number. But LCC has no I, O, W, X, or Y classifications. So

[CODE4LIB] digital librarian job description

2011-03-31 Thread Eric Lease Morgan
Below is an abbreviated digital librarian job description -- a grant-funded temporary position here at Notre Dame: The overall goal of the Vector Control Development Network (VCDN) is to develop an analytical framework for the evaluation of the transmission of vector borne diseases to

[CODE4LIB] digital preservation management workshop

2011-03-31 Thread Eric Lease Morgan
[Forwarded on behalf of Nancy McGovern nancy...@umich.edu --ELM] Call for Applications We are very pleased that our colleagues at the University at Albany, SUNY will host the five-day Digital Preservation Management workshop this June in Albany, New York. Application form available on April

[CODE4LIB] XC NCIP Toolkit Connectors available for Symphony and Voyager ILS

2011-03-31 Thread Cook, Randall
Here is exciting news regarding resource sharing and discovery using the NCIP protocol. The open source community of software developers working on and supporting the eXtensible Catalog's (XC) NCIP Toolkit, http://code.google.com/p/xcncip2toolkit/ is pleased to announce the release of new

[CODE4LIB] techniques for parsing legacy library data

2011-03-31 Thread Thomale, Jason
Hey all, I <3 today's LCC thread, and ones like it. It seems like there's a ton of knowledge out there (buried) about parsing various pieces of library data like this, but I haven't really seen a concerted effort to log/organize this info in one place. It seems like such a thing could be a

Re: [CODE4LIB] techniques for parsing legacy library data

2011-03-31 Thread Simon Spero
I strongly suggest taking a look at GATE (http://gate.ac.uk) and UIMA (http://uima-framework.sourceforge.net/). GATE can use UIMA workflows as processing resources. UIMA can use GATE workflows as processing resources. Don't cross the streams... Simon On Thu, Mar 31, 2011 at 5:22 PM,

Re: [CODE4LIB] techniques for parsing legacy library data

2011-03-31 Thread Doran, Michael D
Hi Jason, I started a page on the wiki: http://wiki.code4lib.org/index.php/Parsing_Library_Data Cool idea. I added a link under the Title section to a small code snippet for parsing titles to determine the number of nonfiling characters (for when converting non-MARC data to MARC). --
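The snippet linked from the wiki is not reproduced here. As a rough illustration of the idea only, the following Python sketch counts nonfiling characters for English titles by looking for a leading article; the count includes the space after the article. The article list and the function name are assumptions, and a real converter would also need to handle other languages and leading punctuation.

```python
# Assumption: English articles only; longer articles are checked first.
ARTICLES = ("the", "an", "a")


def nonfiling_count(title: str) -> int:
    """Return the number of leading characters to skip when filing a title.

    Hypothetical helper: counts a leading article plus the space after it,
    or returns 0 when the title does not start with a known article.
    """
    lowered = title.lower()
    for article in ARTICLES:
        prefix = article + " "
        # Require something after the article so a bare "A " is not skipped.
        if lowered.startswith(prefix) and len(title) > len(prefix):
            return len(prefix)
    return 0
```

The returned number is the kind of value that would go into a nonfiling-characters indicator when converting non-MARC data to MARC.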

Re: [CODE4LIB] MARC magic for file

2011-03-31 Thread William Denton
On 28 March 2011, Ford, Kevin wrote: I couldn't get Simon's MARC 21 Magic file to work. Among other issues, I received "line too long" errors. But, since I've been curious about this for some time, I figured I'd take a whack at it myself. Try this: This is very nice! Thanks. I tried it on