Hi all,
Please excuse any cross-postings. The Massachusetts Library Network
Cooperative is extending the deadline for its Request for Proposals (RFP)
for Evergreen enhancements. The new deadline is 5 p.m. (EDT) today (March
31, 2011). The RFP is available at http://masslnc.cwmars.org/node/2301
Does anyone have a good regular expression that will match all legal LC
Call Numbers from the LC Classified Schedule, but will generally not
match things that could not possibly be an LC Call Number from the LC
Classified Schedule?
In particular, I need it to NOT match an MLC call number,
Check the regexp that Google uses in their call number normalization:
http://code.google.com/p/library-callnumber-lc/wiki/Home
You may want to remove the prefix part, and allow for a fourth cutter.
The folks at UNC pointed me to this a few months ago.
-Tod
On Mar 31, 2011, at 11:29
Thanks, that looks good!
It's hosted on Google Code, but I don't think that code is anything
Google uses; it looks like it's from our very own Bill Dueber.
On 3/31/2011 12:38 PM, Tod Olson wrote:
Check the regexp that Google uses in their call number normalization:
Except now I wonder if those annoying MLCS call numbers might actually
be properly MATCHED by this regex, when I need 'em excluded. They are
annoyingly _similar_ to a classified call number. Well, one way to find out.
And the reason this matters is to try and use an LCC to map to a
'discipline'
The Google Code regex looks like it will accept any 1-3 letters at the
start of the call number. But LCC has no I, O, W, X, or Y
classifications.
So you might want to use something more like ^[A-HJ-NP-VZ] at the
start of the regex.
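A minimal sketch of that suggestion, in Python. This is not the regex from the Google Code project; the pattern, the name `looks_like_lcc`, and the sample call numbers are assumptions for illustration. It only checks the class letters and class number, ignoring cutters and dates, and it restricts the first letter to those LCC actually uses.

```python
import re

# Hypothetical, simplified pattern: 1-3 class letters (first letter limited
# to those used by LCC), then a class number with an optional decimal part.
# Cutters, dates, and prefixes are deliberately not handled here.
LCC_START = re.compile(r"^([A-HJ-NP-VZ][A-Z]{0,2})\s*(\d{1,4})(\.\d+)?")

def looks_like_lcc(call_number: str) -> bool:
    """Return True if the string starts like a classified LC call number."""
    return LCC_START.match(call_number.strip()) is not None

if __name__ == "__main__":
    for sample in ["QA76.73 .P98", "KFH1234.5 .A3", "MLCS 83/5180 (P)", "XK123"]:
        print(sample, looks_like_lcc(sample))
```

Because the pattern requires a digit right after at most three letters, a four-letter prefix such as MLCS is rejected even though M is a legal first letter.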
Also, there are only a few major classifications that use three
Hi Jonathan,
Although designed for a different purpose, you might want to take a look at the
regex in the LC call number sorting utilities on this page:
http://rocky.uta.edu/doran/sortlc/
Note that unparsable call numbers are printed to STDERR with an error message. So you
could run it against a
You could also try to use the code I put in SolrMarc utilities classes
ha ha ha.
- Naomi
On Mar 31, 2011, at 10:25 AM, Keith Jenkins wrote:
The Google Code regex looks like it will accept any 1-3 letters at the
start of the call number. But LCC has no I, O, W, X, or Y
classifications.
So
Below is an abbreviated digital librarian job description -- a grant-funded
temporary position here at Notre Dame:
The overall goal of the Vector Control Development Network (VCDN)
is to develop an analytical framework for the evaluation of the
transmission of vector borne diseases to
[Forwarded on behalf of Nancy McGovern nancy...@umich.edu --ELM]
Call for Applications
We are very pleased that our colleagues at the University at Albany, SUNY will
host the five-day
Digital Preservation Management workshop this June in Albany, New York.
Application form available on April
Here is exciting news regarding resource sharing and discovery using the
NCIP protocol.
The open source community of software developers working on and
supporting the eXtensible Catalog's (XC) NCIP Toolkit,
http://code.google.com/p/xcncip2toolkit/ is pleased to announce the
release of new
Hey all,
I <3 today's LCC thread, and ones like it.
It seems like there's a ton of knowledge out there (buried) about parsing
various pieces of library data like this, but I haven't really seen a concerted
effort to log/organize this info in one place. It seems like such a thing could
be a
I strongly suggest taking a look at GATE (http://gate.ac.uk) and UIMA (
http://uima-framework.sourceforge.net/ ).
GATE can use UIMA workflows as processing resources. UIMA can use GATE
workflows as processing resources.
Don't cross the streams...
Simon
On Thu, Mar 31, 2011 at 5:22 PM,
Hi Jason,
I started a page on the wiki:
http://wiki.code4lib.org/index.php/Parsing_Library_Data
Cool idea. I added a link under the Title section to a small code snippet for
parsing titles to determine the number of nonfiling characters (for when
converting non-MARC data to MARC).
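For readers who have not seen the idea, here is a rough sketch in Python. It is not the linked snippet; the function name `nonfiling_count` and the English-only article list are assumptions. The count includes the article and the space after it, which is the value that would go in the nonfiling-characters indicator of a MARC title field.

```python
# Hypothetical article list: English only. A real converter would need
# per-language lists and handling for leading punctuation.
ARTICLES = ("the", "an", "a")

def nonfiling_count(title: str) -> int:
    """Count leading characters to skip when filing (article plus space)."""
    lowered = title.lower()
    for article in ARTICLES:
        prefix = article + " "
        # Require something after the article so a bare "A " is not skipped.
        if lowered.startswith(prefix) and len(title) > len(prefix):
            return len(prefix)
    return 0

if __name__ == "__main__":
    for sample in ["The Great Gatsby", "An Apple a Day", "Theory of Everything"]:
        print(sample, nonfiling_count(sample))
```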
--
On 28 March 2011, Ford, Kevin wrote:
I couldn't get Simon's MARC 21 Magic file to work. Among other issues,
I received line too long errors. But, since I've been curious about
this for some time, I figured I'd take a whack at it myself. Try this:
This is very nice! Thanks. I tried it on