Is there anyone out there with experience processing the raw data files for the 
Getty vocabularies (particularly TGN)?
 
We're adopting AAT and TGN as the primary vocabularies for a new shared 
cataloging system covering our museum, library and archival collections. I'm 
currently trying to write scripts that automatically match places in our 
existing databases to places in the TGN taxonomy. But I'm finding that the 
Getty data files are very complex, and I haven't yet figured out a foolproof 
method to do this. I'm curious whether anyone else has traveled this road 
before, and if so, whether you might be able to share some tips or code snippets.
 
Since most of our place names are going to be in the US, my gut feeling has 
been to first extract a list of places in the US and dump things like state, 
county, etc. into discrete database fields that I can match against. But I 
find myself a bit flummoxed by the polyhierarchical nature of the data (where 
one place can belong to multiple higher-level places).
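
To make the flattening idea concrete, here is a rough Python sketch of what I 
have in mind. It assumes the records have already been parsed out of the Getty 
files into a simple in-memory structure; the Place tuple, the sample places, 
the IDs and the TYPE_TO_FIELD mapping are all made up for illustration and are 
not the actual TGN file layout. It also assumes the parent links contain no 
cycles. A place with several parents simply yields one flat row per path.

```python
from collections import namedtuple

# Hypothetical parsed record; not the real TGN file layout.
Place = namedtuple("Place", ["id", "name", "place_type", "parent_ids"])


def ancestor_paths(place_id, places):
    """Return every path from a place up to a root, one list per path."""
    place = places[place_id]
    if not place.parent_ids:
        return [[place]]
    paths = []
    for parent_id in place.parent_ids:
        for path in ancestor_paths(parent_id, places):
            paths.append([place] + path)
    return paths


def flatten(place_id, places, type_to_field):
    """Build one flat row per ancestor path, keyed by discrete field names."""
    rows = []
    for path in ancestor_paths(place_id, places):
        row = {"id": place_id, "name": places[place_id].name}
        for ancestor in path[1:]:
            field = type_to_field.get(ancestor.place_type)
            # Keep the nearest ancestor if two share the same field.
            if field and field not in row:
                row[field] = ancestor.name
        rows.append(row)
    return rows


# Invented sample data: place 5 has two parents (a county and a region).
PLACES = {
    1: Place(1, "United States", "nation", []),
    2: Place(2, "Example State", "state", [1]),
    3: Place(3, "Example County", "county", [2]),
    4: Place(4, "Example Region", "general region", [2]),
    5: Place(5, "Exampleville", "inhabited place", [3, 4]),
}
TYPE_TO_FIELD = {"nation": "country", "state": "state", "county": "county"}

ROWS = flatten(5, PLACES, TYPE_TO_FIELD)
for row in ROWS:
    print(row)
```

With one row per path, a local record could be matched against whichever row 
has the fields it actually supplies, though I'm not sure yet how to pick a 
preferred row when several match.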
 
Another issue is the wide variety of place types in use in the taxonomy. 
England, for example, is a country, but the United States is a nation. This 
makes sense to a degree, but it also makes it a bit hard to figure out which 
term to match when you're trying to automate matching against data where the 
creators were less discerning about this sort of fine distinction.
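
One workaround I've been toying with is to collapse the fine-grained place 
types into a few coarse levels before comparing. The sketch below is only an 
illustration: the COARSE_LEVEL table, the record dictionaries and the function 
names are my own assumptions, and the table would need to be filled in from 
the place types that actually occur in the data.

```python
# Hypothetical mapping from fine-grained place types to coarse levels.
COARSE_LEVEL = {
    "nation": "country",
    "country": "country",
    "state": "state",
    "county": "county",
}


def normalize_name(name):
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(name.casefold().split())


def coarse(place_type):
    """Map a place type to its coarse level; unknown types pass through."""
    key = normalize_name(place_type)
    return COARSE_LEVEL.get(key, key)


def find_candidates(local_name, local_level, vocab_records):
    """Return vocabulary records whose name and coarse level both match."""
    target_name = normalize_name(local_name)
    target_level = coarse(local_level)
    return [
        rec for rec in vocab_records
        if normalize_name(rec["name"]) == target_name
        and coarse(rec["place_type"]) == target_level
    ]


# Invented sample records reflecting the distinction described above.
VOCAB = [
    {"name": "England", "place_type": "country"},
    {"name": "United States", "place_type": "nation"},
]

MATCHES = find_candidates("united  states", "Country", VOCAB)
print(MATCHES)
```

This way a local record that just says "country" can still land on a 
vocabulary record typed as a nation, at the cost of losing the distinction.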
 
I feel like I'm surely not the first person to tackle this, and would love to 
exchange notes...
 
-David Dwiggins
 
__________
 
David Dwiggins
Systems Librarian/Archivist, Historic New England
141 Cambridge Street, Boston, MA 02114
(617) 227-3956 x 242 
[email protected] 
http://www.historicnewengland.org

