Thanks, that looks good!
It's hosted on Google Code, but I don't think that code is anything
"Google uses"; it looks like it's from our very own Bill Dueber.
On 3/31/2011 12:38 PM, Tod Olson wrote:
Check the regexp that Google uses in their call number normalization:
http://code.google.com/p/library-callnumber-lc/wiki/Home
You may want to remove the prefix part, and allow for a fourth cutter.
The folks at UNC pointed me to this a few months ago.
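For illustration only, here is a rough Python sketch of what "no prefix, up to four cutters" might look like. This is NOT the regexp from the library-callnumber-lc page; the pattern, the group names, and the permissive "rest" group are my own assumptions, and it will certainly accept some strings that are not valid call numbers.

```python
import re

# Hypothetical pattern, written from scratch for illustration.
# It is not the regexp published on the library-callnumber-lc wiki.
LC_PATTERN = re.compile(
    r"""^\s*
    (?P<letters>[A-Z]{1,3})                   # class letters, e.g. "QA"
    \s*
    (?P<digits>\d{1,4})                       # class number, e.g. "76"
    (?:\.(?P<decimal>\d+))?                   # optional decimal part, e.g. ".73"
    (?P<cutters>(?:\s*\.?\s*[A-Z]\d+){0,4})   # zero to four cutters, e.g. ".R83 T46"
    (?:\s+(?P<rest>.*))?                      # anything trailing: year, volume, etc.
    \s*$
    """,
    re.VERBOSE,
)


def parse_lc(value):
    """Return a dict of the named parts, or None if the value does not match."""
    match = LC_PATTERN.match(value)
    return match.groupdict() if match else None
```

Because there is no prefix group, a value has to start directly with one to three capital letters followed by a digit, so strings like "MLCS 83/5180 (P)" fall through without a match. The catch-all "rest" group is deliberately loose; tighten it if false positives matter more than false negatives.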
-Tod
On Mar 31, 2011, at 11:29 AM, Jonathan Rochkind wrote:
Does anyone have a good regular expression that will match all legal LC
Call Numbers from the LC Classified Schedule, but will generally not
match things that could not possibly be an LC Call Number from the LC
Classified Schedule?
In particular, I need it to NOT match an "MLC" call number, which is an
LC assigned call number that shows up in an 050 with no way to
distinguish based on indicators, but isn't actually from the LC
Schedules. Here's an example of an "MLC" call number:
"MLCS 83/5180 (P)"
Hmm, maybe all MLC call numbers begin with "MLC". Okay, I guess I can
exclude them just like that. But it looks like there are also OTHER
things that can show up in the 050 but aren't actually from the
classified schedule; the OCLC documentation even contains an example of
"Microfilm 19072 E".
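A minimal Python sketch of the "exclude them just like that" idea, combined with a loose shape test to catch things like the Microfilm example. Both rules are assumptions rather than anything documented: that every MLC number starts with "MLC", and that a schedule number starts with one to three capital letters followed by a digit. The function name is made up.

```python
import re

# Assumption: anything beginning with "MLC" (e.g. "MLCS ...") is not from the schedules.
MLC_PREFIX = re.compile(r"^\s*MLC", re.IGNORECASE)

# Assumption: a schedule number starts with 1-3 capital letters, then a digit.
SCHEDULE_START = re.compile(r"^\s*[A-Z]{1,3}\s*\d")


def from_lc_schedule(value):
    """Guess whether an 050 value looks like a classified-schedule call number."""
    if MLC_PREFIX.match(value):
        return False  # explicit MLC exclusion
    return bool(SCHEDULE_START.match(value))


for value in ["MLCS 83/5180 (P)", "Microfilm 19072 E", "QA76.73.R83"]:
    print(value, "->", from_lc_schedule(value))
```

The explicit prefix check is not redundant: a made-up value such as "MLC 83/5180" would pass the shape test on its own, since "MLC" is three capital letters followed by a digit.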
What a mess, huh? So, yeah, regex anyone?
[You can probably guess why I care if it's from the LC Classified
Schedule or not].
Tod Olson<[email protected]>
Systems Librarian
University of Chicago Library