[CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Owen Stephens
We are working on converting some MARC library records to RDF, and looking at how we handle links to LCSH (id.loc.gov) - and I'm looking for feedback on how we are proposing to do this... I'm not 100% confident about the approach, and to some extent I'm trying to work around the nature of how

Re: [CODE4LIB] utf8 \xC2 does not map to Unicode

2011-04-07 Thread Tod Olson
yaz-marcdump does a really good job of charset and format conversion for MARC records, and is blindingly fast. But yaz-marcdump seems to think there are a lot of separators in the wrong place and bad indicator data, whether treating the records as UTF-8 or MARC-8. The leaders in the records

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Owen Stephens
Thanks Tom - very helpful. Perhaps this suggests that rather than using an order we should check combinations while preserving the order of the original 650 field (I assume this should in theory be correct always - or at least done to the best of the cataloguer's knowledge)? So for: 650 _0 $$a
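
A rough sketch of what "checking combinations while preserving the order of the original 650 field" might look like. This is only an illustration, not Owen's actual code: the function name `candidate_headings`, the sample subfields, and the choice to always keep the first subfield and join parts with "--" are all assumptions made for the example.

```python
from itertools import combinations


def candidate_headings(subfields):
    """Build candidate heading strings from (code, value) subfield pairs.

    Assumption: the first subfield is always kept, and any subset of the
    remaining subfields may follow it in their original order.
    Candidates are returned longest first.
    """
    if not subfields:
        return []
    head, rest = subfields[0], subfields[1:]
    candidates = []
    # Try the fullest combination first, then progressively shorter ones.
    for size in range(len(rest), -1, -1):
        # combinations() yields items in input order, so the order of the
        # original field is preserved within every candidate.
        for combo in combinations(rest, size):
            parts = [head[1]] + [value for _code, value in combo]
            candidates.append("--".join(parts))
    return candidates


# Hypothetical 650 field, invented for this example.
example = [("a", "Education"), ("z", "England"), ("x", "Finance")]
for heading in candidate_headings(example):
    print(heading)
```

Each candidate string could then be looked up against id.loc.gov, stopping at the first match; how that lookup is done is left out here.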

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Ya'aqov Ziso
*... Creating possibly invalid headings isn't necessarily a problem - as we won't get a match on id.loc.gov anyway ... *LCSH headings reflect materials cataloged by LC. You may have materials at your UK (or Albania, Tunisia, etc.) which were not cataloged yet at LC, thus nothing

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Houghton,Andrew
After having done numerous matching and mapping projects, there are some issues that you will face with your strategy, assuming I understand it correctly. Trying to match a heading starting at the left most subfield and working forward will not necessarily produce correct results when matching

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Ya'aqov Ziso
Andrew, please see *[YZ]* below *181 __ $z England and you would NOT find this heading in LCSH. This is issue one. Unfortunately, LC does not create 181 in LCSH (actually I think there are some, but not if it’s a name), instead they create a 781 in the name authority record. * *[YZ]* MARC/LCSH

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Owen Stephens
Still digesting Andrew's response (thanks Andrew), but On Thu, Apr 7, 2011 at 4:17 PM, Ya'aqov Ziso yaaq...@gmail.com wrote: *Currently under id.loc.gov you will not find name authority records, but you can find them at viaf.org*. *[YZ]* viaf.org does not include geographic names. I just

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread LeVan,Ralph
If you look at the fields those names come from, I think they mean England as a corporation, not England as a place. Ralph -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Owen Stephens Sent: Thursday, April 07, 2011 11:28 AM To:

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Ford, Kevin
Actually, it appears to depend on whose Authority record you're looking at. The Canadians, Australians, and Israelis have it as a CorporateName (110), as do the French (210 - unimarc); LC and the Germans say it's a Geographic Name. In the case of LCSH, therefore, it would be a 151.
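
To make the tag numbers in this exchange easier to follow, here is a small lookup of MARC 21 authority heading tags. It is a sketch for orientation only: the names `HEADING_TYPES` and `heading_type` are invented for this example, and UNIMARC tags (such as the 210 mentioned above) are deliberately not covered.

```python
# MARC 21 authority 1XX heading tags and the kind of heading they carry.
HEADING_TYPES = {
    "100": "personal name",
    "110": "corporate name",
    "111": "meeting name",
    "130": "uniform title",
    "150": "topical term",
    "151": "geographic name",
}


def heading_type(tag):
    """Return the heading type for a MARC 21 authority tag, or 'unknown'."""
    return HEADING_TYPES.get(tag, "unknown")


# The two tags under discussion for 'England'.
for tag in ("110", "151"):
    print(tag, heading_type(tag))
```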

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Ya'aqov Ziso
Ralph, Owen's pointing to a list where corporate (110) and geographic names (151) are mixed. Thanks Owen, I haven't seen that the first time. I guess you got that mixed 110/151 when limiting to 'exact name'. Perhaps Andrew has a workaround. *Ya'aqov* On Thu, Apr 7, 2011 at 10:34 AM,

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Owen Stephens
I'm out of my depth here :) But... this is what I understood Andrew to be saying. In this instance (?because 'England' is a Name Authority?) rather than create a separate LCSH authority record for 'England' (as the 151), the LCSH subdivision is recorded in the 781 of the existing Name

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Ya'aqov Ziso
Kevin, England exists as a corporate body and also as a geographic name. BOTH entities exist in LCSH. This doesn't apply to all geographic names, only to some. Andrew pointed us to VIAF, but I expect his algorithm to limit the search for LCSH. Let's wait for his reply. *Ya'aqov* *On Thu, Apr

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread LeVan,Ralph
More confusing yet, if you look at the raw XML for that record (add viaf.xml to the end of the URI and then view source) you’ll see that the name type is indeed Geographic. My boss is puzzled. Ralph From: Ya'aqov Ziso [mailto:yaaq...@gmail.com] Sent: Thursday, April 07, 2011 11:56

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Jonathan Rochkind
On 4/7/2011 10:46 AM, Houghton,Andrew wrote: to go to the name authority record 150 England with LCCN n82068148. Currently under id.loc.gov you will not find name authority records, If this would change, so name authority record elements used in 6xx subject cataloging were in id.loc.gov, it

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Ya'aqov Ziso
Jonathan, hi and thanks, 1. I believe id.loc.gov includes a list of MARC countries and a list for geographic areas (based on the geographic names in 151 fields). 2. cataloging rules instruct catalogers to use THOSE very name forms in 151 $a when a subject can be divided (limited) geographically

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Houghton,Andrew
1. No disagreement, except that some 151 appears in the name file and some appear in the subject file: n82068148 008/11=a 008/14=a 151 _ _ $a England sh2010015057 008/11=a 008/14=b 151 _ _ $a Tabasco Mountains (Mexico) 2.
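
The two record numbers above differ in their LCCN prefix: "n" for the name file and "sh" for the subject file. A minimal sketch of telling them apart by prefix follows. The function name `authority_file` is made up for this example, and treating every prefix that starts with "n" as the name file is a simplifying assumption; other prefixes are reported as unknown rather than guessed.

```python
import re


def authority_file(lccn):
    """Guess which LC authority file an LCCN belongs to from its prefix."""
    match = re.match(r"[a-z]+", lccn.strip().lower())
    prefix = match.group(0) if match else ""
    if prefix == "sh":
        return "subjects"
    if prefix.startswith("n"):
        # Assumption: all n-prefixed LCCNs belong to the name file.
        return "names"
    return "unknown"


# The two LCCNs quoted in the message above.
for lccn in ("n82068148", "sh2010015057"):
    print(lccn, authority_file(lccn))
```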

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Houghton,Andrew
That is probably correct. England may appear as both a 110 *and* a 151 because the 110 signifies the concept for the country entity while the 151 signifies the concept for the geographic place. A subtle distinction... Andy. -Original Message- From: Code for Libraries

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Ya'aqov Ziso
*Andrew, as always, most helpful news, kindest thanks! more [YZ] below:* *1. No disagreement, except that some 151 appears in the name file and some appear in the subject file:* *n82068148 008/11=a 008/14=a 151 _ _ $a England* *sh2010015057 008/11=a

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Ross Singer
On Thu, Apr 7, 2011 at 12:58 PM, Ya'aqov Ziso yaaq...@gmail.com wrote: 1. I believe id.loc.gov includes a list of MARC countries and a list for geographic areas (based on the geographic names in 151 fields). 2. cataloging rules instruct catalogers to use THOSE very name forms in 151 $a when a

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Houghton,Andrew
My bad in (2) that should have been 781 and it’s LC’s way to indicate the geographic form used for a 181 when a heading may be geographically subdivided. The point is, when you are trying to do authority matching/mapping you have to match against the 181’s in LCSH *and* the 781’s in NAF. This

Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Jonathan Rochkind
On 4/7/2011 1:21 PM, Houghton,Andrew wrote: That is probably correct. England may appear as both a 110 *and* a 151 because the 110 signifies the concept for the country entity while the 151 signifies the concept for the geographic place. A subtle distinction... This starts getting into

Re: [CODE4LIB] [dpla-discussion] Rethinking the library part of DPLA

2011-04-07 Thread Eric Hellman
The DPLA listserv is probably too impractical for most of Code4Lib, but Nate Hill (who's on this list as well) made this contribution there, which I think deserves attention from library coders here. On Apr 5, 2011, at 11:15 AM, Nate Hill wrote: It is awesome that the project Gutenberg stuff