PS: Kyle, that's your own version? That's... sort of kind of machine readable. Well, not really. I can't quite figure out what's going on there: the label/value pairs are just stuffed into single JavaScript string literals, separated by newlines, or sometimes (but sometimes not) by "Assigned code:" strings, etc.

That's in fact a little bit harder to parse than what I'm doing against LC. I'm running CSS selectors against the HTML, and I'm not having any difficulty parsing; the problem is that the format can change without notice. But yours seems harder to parse to me. Am I missing something?

In the end, all I need is a list of pairs, code to label. I'll be looking up from code, so I don't even care about "alternate labels", really.
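A minimal sketch of that code-to-label lookup, using only Python's standard library. Everything about the markup here is an assumption: the sample HTML, the two-cell row layout, the placeholder codes, and the names GacTableParser and parse_gac_html are made up for illustration, and the real LC page may be structured quite differently (which is exactly the fragility being complained about).

```python
from html.parser import HTMLParser


class GacTableParser(HTMLParser):
    """Collect code -> label pairs from table rows that have exactly two <td> cells.

    The row layout is an assumption for illustration, not the real LC markup.
    """

    def __init__(self):
        super().__init__()
        self.pairs = {}
        self._row = None   # cells of the current row, or None when outside <tr>
        self._cell = None  # text chunks of the current cell, or None when outside <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            # Keep only rows that look like (code, label); header rows use <th>.
            if len(self._row) == 2 and self._row[0]:
                code, label = self._row
                self.pairs[code] = label
            self._row = None
            self._cell = None


def parse_gac_html(html):
    """Return a dict mapping code to label for the given HTML text."""
    parser = GacTableParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Made-up sample input; the codes and labels are placeholders, not real GAC data.
SAMPLE_HTML = """
<table>
  <tr><th>Code</th><th>Label</th></tr>
  <tr><td>xx-a</td><td>Example Area A</td></tr>
  <tr><td>xx-b</td><td>Example Area B</td></tr>
</table>
"""

gac_labels = parse_gac_html(SAMPLE_HTML)
print(gac_labels.get("xx-a"))
```

Since the lookup only ever goes from code to label, a plain dict is enough, and rows that do not match the assumed shape are skipped rather than guessed at.
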

On 6/22/2011 5:57 PM, Kyle Banerjee wrote:
I went through a process similar to what you describe sometime back for a
tool I made (i.e. I could find no easily downloadable info). You can
download something that will be easier to parse from

http://calculate.alptown.com/gac.js

It's probably not 100% accurate as I haven't downloaded for quite a while.
But catalogers have me correct errors they discover and there are about 800
unique visitors per day so I assume they notice most things.

It would be nice if this kind of data could be provided in a straightforward
format.

kyle



On Wed, Jun 22, 2011 at 2:44 PM, Jonathan Rochkind wrote:

Can anyone remind me if there's a machine readable copy of the MARC
geographic codes available at any persistent URL?

They're in HTML at
http://www.loc.gov/marc/geoareas/gacs_code.html.
I actually had a script that automatically downloaded from there and
"scraped" the HTML -- but sometime since I wrote the script, the HTML
structure on the page changed and it broke.

(I kind of thought that was unlikely since that HTML page itself was
machine generated -- but I guess they changed the software that generated
it. Certainly I knew that scraping HTML was a bad thing to rely on... which
is why I hope LC provides this in some format less likely to change?)


