Re: [CODE4LIB] source of marc geographic code?
The GeographicArea codes have been available from [1] in XML [2] since at least late 2007 [3]. I can't say with 100% certainty that the XML structure has remained perfectly consistent since 2007, but eyeballing the 2007 version and comparing it to the currently available file suggests that it has.

The GACS codes are also available from ID, as has been pointed out. The entire list is available for download at [4]. Let me acknowledge, though, that the labels for the URIs (incidentally, the GACS code is the last token of the URI) are not part of the RDF/N-triples/JSON at [5]. This sounds like a feature request - and a useful one at that. Would that be an accurate interpretation of this thread?

Cordially,
Kevin
--
Network Development and MARC Standards Office

[1] http://www.loc.gov/marc/geoareas/gacshome.html
[2] http://www.loc.gov/standards/codelists/gacs.xml
[3] http://web.archive.org/web/20071129170212/http://www.loc.gov/marc/geoareas/
[4] http://id.loc.gov/download/
[5] http://id.loc.gov/vocabulary/geographicAreas.html

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jonathan Rochkind [rochk...@jhu.edu]
Sent: Wednesday, June 22, 2011 21:43
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] source of marc geographic code?

The result was that a few meetings later LC announced that they had coded the MARC online pages in XML, and were generating the HTML from that. I think I was mis-understood.

No doubt, but man if they'd then just SHARE that XML with us at a persistent URL, and keep the structure of that XML the same, that'd be really useful!
Re: [CODE4LIB] source of marc geographic code?
Huh, that does look like it's got what I need, although it's a bit confusing. I wasn't able to find a URL to a file with the format Karen cites below. I'm probably dense. Can anyone give me the URL that returns a list of all terms, with each term having the XML Karen quotes below?

It looks like I'd still have to drill down into what should be an opaque identifier to get the actual MARC code: extract "fq" from http://id.loc.gov/vocabulary/geographicAreas/fq, hard-coding in the assumption that those URLs will always be of that form, with the last term being the actual MARC code. But that's _probably_ a safe assumption. (Although it wouldn't hurt if they added a data element "marcCode" or something with the actual literal "fq" in it.)

On 6/22/2011 10:35 PM, Karen Coyle wrote:

Quoting Jonathan Rochkind rochk...@jhu.edu:

Right, so like I keep saying, as far as I can tell, those files are lists of URLs, one for each code. (Or technically lists of RDF-triples, but where two parts of each triple are identical in every triple, just saying this URL is part of the marc geographic vocabulary, and then each triple has a unique URL representing a code.) And I'd need to do a separate HTTP request for each code (a couple hundred?) to actually get the label(s).

I'm not sure why you see it as separate requests, unless the downloaded file doesn't work for you -- but maybe I don't understand what you are trying to do. The downloaded full file has the display data and the codes:

  <rdf:Description rdf:about="http://id.loc.gov/vocabulary/geographicAreas/fq">
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
    <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Resource"/>
    <owl:sameAs rdf:resource="info:lc/vocabulary/gacs/fq"/>
    <skos:prefLabel xml:lang="en">Africa, French-speaking Equatorial</skos:prefLabel>
    <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">fq</skos:notation>
    <skos:inScheme rdf:resource="http://id.loc.gov/vocabulary/geographicAreas"/>
    <skos:altLabel xml:lang="en">Africa, Equatorial</skos:altLabel>
    <skos:narrower>
      <skos:Concept>
        <skos:prefLabel xml:lang="en">Chad, Lake</skos:prefLabel>
        <skos:broader rdf:resource="http://id.loc.gov/vocabulary/geographicAreas/fq"/>
      </skos:Concept>
    </skos:narrower>
    <skos:altLabel xml:lang="en">French Equatorial Africa</skos:altLabel>
    <skos:altLabel xml:lang="en">French-speaking Equatorial Africa</skos:altLabel>
    <skos:exactMatch rdf:resource="http://id.loc.gov/authorities/sh85001608#concept"/>
    <skos:broader rdf:resource="http://id.loc.gov/vocabulary/geographicAreas/f"/>
    <vs:term_status>stable</vs:term_status>
    <skos:changeNote rdf:nodeID="fq"/>
  </rdf:Description>

so if you pick out the value in prefLabel and the value in notation you have what you need, no? (admittedly, this is NOT the same as a simple, comma delimited list!)

  <skos:prefLabel xml:lang="en">Chad, Lake</skos:prefLabel>
  <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">fq</skos:notation>

Am I missing something? That's not a very convenient way to get the data for the very common use case of wanting to construct a mapping from code to label, right? Or that's just me?

What would be nice would be a simple XSLT transform that turns out a CSV on the fly, always getting the latest values. No?

kc
Re: [CODE4LIB] source of marc geographic code?
On Thu, Jun 23, 2011 at 10:59 AM, Jonathan Rochkind rochk...@jhu.edu wrote:

On 6/22/2011 11:25 PM, Ross Singer wrote:

Can't you use: http://www.loc.gov/standards/codelists/gacs.xml

Yes, I can! I didn't know about/hadn't found that one either; it hadn't been mentioned until now. Thanks! Where did you find that?

That XML file is linked from near the bottom of this page: http://www.loc.gov/marc/geoareas/

Keith
Re: [CODE4LIB] source of marc geographic code?
On 6/22/2011 11:25 PM, Ross Singer wrote:

Can't you use: http://www.loc.gov/standards/codelists/gacs.xml ? It's what I used to make marccodes.heroku.com/gacs/

Yes, I can! I didn't know about/hadn't found that one either; it hadn't been mentioned until now. Thanks! Where did you find that? That's potentially an even more convenient format for my use case than the RDF version.

Although like Karen pointed out, not sure why you can't use the RDF/XML from id.loc.gov -Ross.

On Wed, Jun 22, 2011 at 5:44 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?)
[CODE4LIB] source of marc geographic code?
Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?)
Re: [CODE4LIB] source of marc geographic code?
I went through a process similar to what you describe sometime back for a tool I made (i.e. I could find no easily downloadable info). You can download something that will be easier to parse from http://calculate.alptown.com/gac.js

It's probably not 100% accurate as I haven't downloaded for quite a while. But catalogers have me correct errors they discover and there are about 800 unique visitors per day, so I assume they notice most things. It would be nice if this kind of data could be provided in a straightforward format.

kyle

On Wed, Jun 22, 2011 at 2:44 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?)

--
Kyle Banerjee
Digital Services Program Manager
Orbis Cascade Alliance
baner...@uoregon.edu / 503.877.9773
Re: [CODE4LIB] source of marc geographic code?
And yes, I realize the structure of the data in the file referenced below is idiotic even if it is easier to parse than HTML. But this was part of the first javascript program I ever wrote, and that was back in 1997 when getting real time interaction with browsers was harder (and I never bothered to rewrite).

kyle

On Wed, Jun 22, 2011 at 2:57 PM, Kyle Banerjee baner...@uoregon.edu wrote:

I went through a process similar to what you describe sometime back for a tool I made (i.e. I could find no easily downloadable info). You can download something that will be easier to parse from http://calculate.alptown.com/gac.js

It's probably not 100% accurate as I haven't downloaded for quite a while. But catalogers have me correct errors they discover and there are about 800 unique visitors per day, so I assume they notice most things. It would be nice if this kind of data could be provided in a straightforward format.

kyle

On Wed, Jun 22, 2011 at 2:44 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?)

--
Kyle Banerjee
Digital Services Program Manager
Orbis Cascade Alliance
baner...@uoregon.edu / 503.877.9773
Re: [CODE4LIB] source of marc geographic code?
Man, I figured it was there somewhere and I just didn't know it. If it's really not there, can we like start a campaign to convince LC that part of maintaining the MARC vocabularies is making them available at a persistent URL, in machine-readable fashion, updated and maintained by them as vocabularies change? Or else, how is any software supposed to use them? Counting on developers to manually review notices of update and manually update local lists is inefficient and entirely unrealistic.

On 6/22/2011 5:57 PM, Kyle Banerjee wrote:

I went through a process similar to what you describe sometime back for a tool I made (i.e. I could find no easily downloadable info). You can download something that will be easier to parse from http://calculate.alptown.com/gac.js

It's probably not 100% accurate as I haven't downloaded for quite a while. But catalogers have me correct errors they discover and there are about 800 unique visitors per day, so I assume they notice most things. It would be nice if this kind of data could be provided in a straightforward format.

kyle

On Wed, Jun 22, 2011 at 2:44 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?)
Re: [CODE4LIB] source of marc geographic code?
PS: Kyle, that's your own version? That's... sort of kind of machine readable. Well, not really. I can't figure out quite what's going on there: the label/value pairs are just stuffed in single javascript string literals, separated by newlines, or sometimes (but sometimes not) with "Assigned code:" strings, etc.

That's in fact a little bit harder to parse than what I'm doing against LC. I'm running CSS selectors against the HTML; I'm not having any difficulty parsing, the problem is that the format can change without notice. But yours seems harder to parse to me, am I missing something?

In the end, all I need is a list of pairs, code to label. I'll be looking up from code, so I don't even care about alternate labels, really.

On 6/22/2011 5:57 PM, Kyle Banerjee wrote:

I went through a process similar to what you describe sometime back for a tool I made (i.e. I could find no easily downloadable info). You can download something that will be easier to parse from http://calculate.alptown.com/gac.js

It's probably not 100% accurate as I haven't downloaded for quite a while. But catalogers have me correct errors they discover and there are about 800 unique visitors per day, so I assume they notice most things. It would be nice if this kind of data could be provided in a straightforward format.

kyle

On Wed, Jun 22, 2011 at 2:44 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?)
Re: [CODE4LIB] source of marc geographic code?
Have you looked at id.loc.gov? One of its vocabularies defines URLs for each of the MARC geographic area codes. Stephen On Wed, Jun 22, 2011 at 4:44 PM, Jonathan Rochkind rochk...@jhu.edu wrote: Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?) -- Stephen Hearn, Metadata Strategist Technical Services, University Libraries University of Minnesota 160 Wilson Library 309 19th Avenue South Minneapolis, MN 55455 Ph: 612-625-2328 Fx: 612-625-3428
Re: [CODE4LIB] source of marc geographic code?
Yes -- it is something I created out of thin air. It was originally designed for catalogers who wanted a visual display to duplicate the print, and achieving adequate performance on interactive search, retrieval, and rendering on the computers/browsers at the time forced me to include all the formatting.

To bust it up, split on '@'. That will give you individual records. The labels will tell you the role of each piece of information. For example, 'Assigned code(s):\n ' will be followed by newline delimited codes for the rest of the field; '\n USE ' indicates a SEE reference, while any line that does not contain newlines simply contains a single code.

I realize it sounds nuts, but there aren't that many variations so it's not as bad as it looks. Since you just want pairs, you might want to load values that have codes into a dictionary so when you encounter a SEE reference, you can create a key value pair. The issue with ignoring alternate names is that there are a number of nonintuitive connections that people wouldn't be able to make.

kyle

On Wed, Jun 22, 2011 at 3:11 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

PS: Kyle, that's your own version? That's... sort of kind of machine readable. Well, not really. I can't figure out quite what's going on there: the label/value pairs are just stuffed in single javascript string literals, separated by newlines, or sometimes (but sometimes not) with "Assigned code:" strings, etc. That's in fact a little bit harder to parse than what I'm doing against LC. I'm running CSS selectors against the HTML; I'm not having any difficulty parsing, the problem is that the format can change without notice. But yours seems harder to parse to me, am I missing something? In the end, all I need is a list of pairs, code to label. I'll be looking up from code, so I don't even care about alternate labels, really.

On 6/22/2011 5:57 PM, Kyle Banerjee wrote:

I went through a process similar to what you describe sometime back for a tool I made (i.e. I could find no easily downloadable info). You can download something that will be easier to parse from http://calculate.alptown.com/gac.js

It's probably not 100% accurate as I haven't downloaded for quite a while. But catalogers have me correct errors they discover and there are about 800 unique visitors per day, so I assume they notice most things. It would be nice if this kind of data could be provided in a straightforward format.

kyle

On Wed, Jun 22, 2011 at 2:44 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?)

--
Kyle Banerjee
Digital Services Program Manager
Orbis Cascade Alliance
baner...@uoregon.edu / 503.877.9773
Re: [CODE4LIB] source of marc geographic code?
Aha, that's probably what I need. And now I remember Ross probably pointed that out to me before.

I'm still having trouble figuring out how to get from the RDF triples it's got there to a hash of codes (as they appear in MARC records, not URIs) to labels. It seems like it will in fact be a lot more work than the scraping I'm doing of the HTML page now, but of course the problem with the HTML page is that its structure is not reliable; it changes. So the structured data from id.loc.gov is the way to go, but I'm still getting confused figuring out how to get what I want out of it. If anyone wants to give me any hints, appreciated.

It kind of looks like I FIRST have to get the complete list from one of the structured forms (RDF-XML, triple, etc), and THEN make a separate HTTP request for _each_ term listed in the list to get the code as found in the MARC record and the label. That's a pretty slow process, as well as requiring writing more code than a task like this seems like it should take. Is there anything on that site that can give me the code/label pairs in one single download?

On 6/22/2011 6:38 PM, Stephen Hearn wrote:

Have you looked at id.loc.gov? One of its vocabularies defines URLs for each of the MARC geographic area codes.

Stephen

On Wed, Jun 22, 2011 at 4:44 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?)
Re: [CODE4LIB] source of marc geographic code?
I wasn't aware of this, but it definitely didn't exist way back when I started. You can download all the GACs in XML from that page. kyle On Wed, Jun 22, 2011 at 3:38 PM, Stephen Hearn s-h...@umn.edu wrote: Have you looked at id.loc.gov? One of its vocabularies defines URLs for each of the MARC geographic area codes. Stephen On Wed, Jun 22, 2011 at 4:44 PM, Jonathan Rochkind rochk...@jhu.edu wrote: Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?) -- Stephen Hearn, Metadata Strategist Technical Services, University Libraries University of Minnesota 160 Wilson Library 309 19th Avenue South Minneapolis, MN 55455 Ph: 612-625-2328 Fx: 612-625-3428 -- -- Kyle Banerjee Digital Services Program Manager Orbis Cascade Alliance baner...@uoregon.edu / 503.877.9773
Re: [CODE4LIB] source of marc geographic code?
Quoting Jonathan Rochkind rochk...@jhu.edu:

Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL?

Not sure how persistent, but here's Ross's version: http://marccodes.heroku.com/gacs/

I often made the point at MARBI meetings (before I just gave up going to them) that all of MARC (tags, subfields, codes) should be available in a machine-readable form.[1] I go NUTS when I see those email notices come around, the idea that all over the world people are manually keying in codes into a local table. Please help us make sure that does not happen in any future formats. Make noise now!

kc

[1] The result was that a few meetings later LC announced that they had coded the MARC online pages in XML, and were generating the HTML from that. I think I was mis-understood.

They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?)

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
Re: [CODE4LIB] source of marc geographic code?
It can be found at http://id.loc.gov/vocabulary/geographicAreas.html Look near the bottom of the page for links to the codes as RDF, N-triples, and JSON. Tom On Wed, Jun 22, 2011 at 6:38 PM, Stephen Hearn s-h...@umn.edu wrote: Have you looked at id.loc.gov? One of its vocabularies defines URLs for each of the MARC geographic area codes. Stephen On Wed, Jun 22, 2011 at 4:44 PM, Jonathan Rochkind rochk...@jhu.edu wrote: Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?) -- Stephen Hearn, Metadata Strategist Technical Services, University Libraries University of Minnesota 160 Wilson Library 309 19th Avenue South Minneapolis, MN 55455 Ph: 612-625-2328 Fx: 612-625-3428
Re: [CODE4LIB] source of marc geographic code?
It can be found at http://id.loc.gov/vocabulary/geographicAreas.html Look near the bottom of the page for links to the codes as RDF, N-triples, and JSON.

Right, so like I keep saying, as far as I can tell, those files are lists of URLs, one for each code. (Or technically lists of RDF-triples, but where two parts of each triple are identical in every triple, just saying this URL is part of the marc geographic vocabulary, and then each triple has a unique URL representing a code.) And I'd need to do a separate HTTP request for each code (a couple hundred?) to actually get the label(s).

Am I missing something? That's not a very convenient way to get the data for the very common use case of wanting to construct a mapping from code to label, right? Or that's just me?
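
For a file shaped the way this message describes (one vocabulary URI per code), a minimal sketch of collecting the codes in a single pass might look like the following. It relies on the observation made elsewhere in the thread that the GACS code is the last token of the URI. The sample lines, including the predicate, are made up for illustration and are not copied from the real N-triples file; `codes_from_ntriples` is a hypothetical helper name.

```python
import re

# Prefix shared by the concept URIs, as quoted in the thread.
PREFIX = "http://id.loc.gov/vocabulary/geographicAreas/"

# Hypothetical N-Triples lines; the real file's predicate may differ.
SAMPLE = """\
<http://id.loc.gov/vocabulary/geographicAreas/fq> <http://www.w3.org/2004/02/skos/core#inScheme> <http://id.loc.gov/vocabulary/geographicAreas> .
<http://id.loc.gov/vocabulary/geographicAreas/f> <http://www.w3.org/2004/02/skos/core#inScheme> <http://id.loc.gov/vocabulary/geographicAreas> .
"""

# Capture whatever follows the prefix, up to the closing angle bracket.
URI_RE = re.compile(r"<" + re.escape(PREFIX) + r"([^<>/#]+)>")


def codes_from_ntriples(text):
    """Return the sorted, de-duplicated codes found as the last URI token."""
    return sorted(set(URI_RE.findall(text)))


codes = codes_from_ntriples(SAMPLE)
print(codes)
```

This only yields the codes, not the labels, which is exactly the limitation being complained about here; it also bakes in the assumption that the URI pattern never changes.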
Re: [CODE4LIB] source of marc geographic code?
The result was that a few meetings later LC announced that they had coded the MARC online pages in XML, and were generating the HTML from that. I think I was mis-understood. No doubt, but man if they'd then just SHARE that XML with us at a persistent URL, and keep the structure of that XML the same, that'd be really useful!
Re: [CODE4LIB] source of marc geographic code?
Quoting Jonathan Rochkind rochk...@jhu.edu:

Right, so like I keep saying, as far as I can tell, those files are lists of URLs, one for each code. (Or technically lists of RDF-triples, but where two parts of each triple are identical in every triple, just saying this URL is part of the marc geographic vocabulary, and then each triple has a unique URL representing a code.) And I'd need to do a separate HTTP request for each code (a couple hundred?) to actually get the label(s).

I'm not sure why you see it as separate requests, unless the downloaded file doesn't work for you -- but maybe I don't understand what you are trying to do. The downloaded full file has the display data and the codes:

  <rdf:Description rdf:about="http://id.loc.gov/vocabulary/geographicAreas/fq">
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
    <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Resource"/>
    <owl:sameAs rdf:resource="info:lc/vocabulary/gacs/fq"/>
    <skos:prefLabel xml:lang="en">Africa, French-speaking Equatorial</skos:prefLabel>
    <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">fq</skos:notation>
    <skos:inScheme rdf:resource="http://id.loc.gov/vocabulary/geographicAreas"/>
    <skos:altLabel xml:lang="en">Africa, Equatorial</skos:altLabel>
    <skos:narrower>
      <skos:Concept>
        <skos:prefLabel xml:lang="en">Chad, Lake</skos:prefLabel>
        <skos:broader rdf:resource="http://id.loc.gov/vocabulary/geographicAreas/fq"/>
      </skos:Concept>
    </skos:narrower>
    <skos:altLabel xml:lang="en">French Equatorial Africa</skos:altLabel>
    <skos:altLabel xml:lang="en">French-speaking Equatorial Africa</skos:altLabel>
    <skos:exactMatch rdf:resource="http://id.loc.gov/authorities/sh85001608#concept"/>
    <skos:broader rdf:resource="http://id.loc.gov/vocabulary/geographicAreas/f"/>
    <vs:term_status>stable</vs:term_status>
    <skos:changeNote rdf:nodeID="fq"/>
  </rdf:Description>

so if you pick out the value in prefLabel and the value in notation you have what you need, no? (admittedly, this is NOT the same as a simple, comma delimited list!)

  <skos:prefLabel xml:lang="en">Chad, Lake</skos:prefLabel>
  <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">fq</skos:notation>

Am I missing something? That's not a very convenient way to get the data for the very common use case of wanting to construct a mapping from code to label, right? Or that's just me?

What would be nice would be a simple XSLT transform that turns out a CSV on the fly, always getting the latest values. No?

kc

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
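
The "pick out prefLabel and notation" idea, plus the wish for a CSV, can be sketched in a few lines of Python instead of XSLT. This is only an illustration under assumptions: the sample is a trimmed-down record modelled on the one quoted in this message, and it assumes each code is an rdf:Description element directly under rdf:RDF, which may not match the real download exactly. `code_to_label` and `to_csv` are hypothetical helper names.

```python
import csv
import io
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}

# Trimmed sample modelled on the quoted record; the real file is much larger.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://id.loc.gov/vocabulary/geographicAreas/fq">
    <skos:prefLabel xml:lang="en">Africa, French-speaking Equatorial</skos:prefLabel>
    <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">fq</skos:notation>
    <skos:narrower>
      <skos:Concept>
        <skos:prefLabel xml:lang="en">Chad, Lake</skos:prefLabel>
        <skos:broader rdf:resource="http://id.loc.gov/vocabulary/geographicAreas/fq"/>
      </skos:Concept>
    </skos:narrower>
  </rdf:Description>
</rdf:RDF>
"""


def code_to_label(rdf_xml):
    """Map each skos:notation to the skos:prefLabel of the same record."""
    root = ET.fromstring(rdf_xml)
    mapping = {}
    for desc in root.findall("rdf:Description", NS):
        # find() with a bare tag matches direct children only, so labels of
        # nested skos:narrower concepts are not picked up by mistake.
        notation = desc.find("skos:notation", NS)
        label = desc.find("skos:prefLabel", NS)
        if notation is not None and label is not None:
            mapping[notation.text.strip()] = label.text.strip()
    return mapping


def to_csv(mapping):
    """Render the mapping as CSV text with a header row, sorted by code."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["code", "label"])
    for code in sorted(mapping):
        writer.writerow([code, mapping[code]])
    return buf.getvalue()


mapping = code_to_label(SAMPLE)
print(to_csv(mapping), end="")
```

Restricting the lookup to direct children matters for this record: the nested "Chad, Lake" label belongs to a narrower concept, so the label paired with fq is "Africa, French-speaking Equatorial". The csv module quotes that label because it contains a comma.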
Re: [CODE4LIB] source of marc geographic code?
Can't you use: http://www.loc.gov/standards/codelists/gacs.xml ? It's what I used to make marccodes.heroku.com/gacs/ Although like Karen pointed out, not sure why you can't use the RDF/XML from id.loc.gov -Ross. On Wed, Jun 22, 2011 at 5:44 PM, Jonathan Rochkind rochk...@jhu.edu wrote: Can anyone remind me if there's a machine readable copy of the MARC geographic codes available at any persistent URL? They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I actually had a script that automatically downloaded from there and scraped the HTML -- but sometime since I wrote the script, the HTML structure on the page changed and it broke. (I kind of thought that was unlikely since that HTML page itself was machine generated -- but I guess they changed the software that generated it. Certainly I knew that scraping HTML was a bad thing to rely on... which is why I hope LC provides this in some format less likely to change?)