A good official list of countries is available from the Library of Congress:
http://www.loc.gov/standards/codelists/countries.xml

For background, see:
http://www.loc.gov/marc/countries/

And of course there's ISO 3166, the list of country codes:
http://www.iso.org/iso/home/standards/country_codes/country_names_and_code_elements_xml.htm
http://www.iso.org/iso/country_codes

Not sure about the alternate representations and misspellings, though.

Matt

On Fri, May 17, 2013 at 5:57 AM, Shorthouse, David <[email protected]> wrote:
> Folks,
>
> The Canadensys development team, http://www.canadensys.net is looking
> for efficient, low-maintenance ways to validate and reconcile data in
> its National cache of occurrence data. We are working on a Java
> library to initially tackle single-field Darwin Core validations,
> https://github.com/Canadensys/narwhal-processor. We hope this library
> is sufficiently generalized for uses outside our project.
>
> Our current challenge is to reconcile country names, which requires
> access to an up-to-date, well-maintained knowledge base of country
> names, their alternative representations (possibly multilingual), and
> mappings to known misspellings. For performance reasons, we'd like
> this thesaurus to be embedded in the library, but with the capacity to
> be periodically refreshed with data pulled from external resources
> such as dbpedia.org. This clearly has ties to semantic web thinking
> and, because we're new to the tools and services in this space, we'd
> like to solicit pointers and feedback such that we build this part of
> our library with maximal benefit to other projects. We started
> collecting thoughts here:
> https://github.com/Canadensys/narwhal-processor/issues/14.
>
> Cheers,
>
> David P. Shorthouse
> Christian Gendreau
> _______________________________________________
> tdwg mailing list
> [email protected]
> http://lists.tdwg.org/mailman/listinfo/tdwg
>
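
To make the embedded-thesaurus idea in the quoted message a bit more concrete, here is a minimal sketch in Java. It is only an illustration and not code from narwhal-processor: the class name CountryThesaurus and its methods add, reconcile and normalize are made up for this example. It assumes the ISO 3166 alpha-2 list that ships with the JDK (java.util.Locale) as the seed data, rather than the Library of Congress or ISO files linked above, and it leaves out the periodic refresh from an external source such as dbpedia.org.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: an in-memory thesaurus mapping name variants to ISO 3166 alpha-2 codes.
final class CountryThesaurus {
    // Key: normalized variant (name, translation, misspelling). Value: alpha-2 code.
    private final Map<String, String> index = new HashMap<>();

    CountryThesaurus() {
        // Seed from the ISO 3166 codes bundled with the JDK.
        for (String code : Locale.getISOCountries()) {
            Locale country = new Locale.Builder().setRegion(code).build();
            add(code, code);
            add(country.getDisplayCountry(Locale.ENGLISH), code);
            // The French name depends on the locale data available in the running JDK.
            add(country.getDisplayCountry(Locale.FRENCH), code);
        }
    }

    // Register an extra variant, e.g. a known misspelling or a name pulled from an external source.
    void add(String variant, String code) {
        index.put(normalize(variant), code);
    }

    // Look up a raw field value; empty if the value is null or not in the thesaurus.
    Optional<String> reconcile(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(index.get(normalize(raw)));
    }

    // Strip accents, collapse punctuation and whitespace to single spaces, lower-case.
    static String normalize(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "")
                .replaceAll("[^\\p{L}\\p{N}]+", " ")
                .trim()
                .toLowerCase(Locale.ROOT);
    }
}
```

A caller would build the thesaurus once, register known misspellings by hand (for example add("Cananda", "CA"), where "Cananda" is an invented misspelling), and then call reconcile on each raw country value. Exact-match lookup on a normalized key keeps the lookup cheap, but it will not catch misspellings nobody has registered; fuzzy matching would be a separate design decision.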
