Hi David S,

The short answer is that there is no linkage currently; we periodically maintain the dictionaries in the parser using Google Refine.
John W. and I are discussing merging our separate dictionaries to benefit all downstream dependencies (Canadensys for example) - we're only talking about the tab file dictionaries, so the GBIF parser API won't change. I'll pull in the Drupal content as well.

Cheers,
Tim

On May 22, 2013, at 7:12 PM, Shorthouse, David wrote:

> Dave & Tim,
>
> Can you describe the linkage between the Drupal-based GBIF vocabulary
> server and the dictionaries in your parsers? Is the former used to
> seed the latter? How often does the latter get refreshed from data
> produced in the former? Does all that work take place in Refine? If
> you have published a white paper on this workflow already, could you
> point me to it so I can better understand the depth of the maintenance
> costs?
>
> Cheers,
>
> David
>
> On Sat, May 18, 2013 at 8:50 PM, David Remsen <[email protected]> wrote:
>> David,
>>
>> You might like to use the GBIF vocabulary server. It has a multi-lingual
>> country name thesaurus based on ISO 3166 and has over 23K terms for 226 ISO
>> countries. You can download the data or use the service. It may have some
>> lexical variants and misspellings. You can also get an account and add any
>> you might know of. And all presented to you in your old friend Drupal.
>> Perhaps you might like to serve as curator. Maybe? Diamond in the rough
>> here, I'm sure of it.
>>
>> http://vocabularies.gbif.org/vocabularies/country
>>
>> Best,
>> Dave
>>
>> ----------------------------------------------------------------------------
>> David Remsen
>> Global Biodiversity Information Facility Secretariat
>> Universitetsparken 15, DK-2100 Copenhagen, Denmark
>> Tel: +1 508 289 7477 Fax: +1 508 289 7900
>> Mobile +1 508 274 4055
>> Skype: dremsen
>> ----------------------------------------------------------------------------
>>
>> On May 17, 2013, at 10:39 AM, Matt Jones wrote:
>>
>> A good official list of countries is available from the Library of Congress:
>> http://www.loc.gov/standards/codelists/countries.xml
>> For background, see: http://www.loc.gov/marc/countries/
>>
>> And of course there's ISO 3166, the list of country codes:
>>
>> http://www.iso.org/iso/home/standards/country_codes/country_names_and_code_elements_xml.htm
>> http://www.iso.org/iso/country_codes
>>
>> Not sure about the alternate representations and misspellings, though.
>>
>> Matt
>>
>>
>> On Fri, May 17, 2013 at 5:57 AM, Shorthouse, David
>> <[email protected]> wrote:
>>>
>>> Folks,
>>>
>>> The Canadensys development team, http://www.canadensys.net is looking
>>> for efficient, low-maintenance ways to validate and reconcile data in
>>> its National cache of occurrence data. We are working on a Java
>>> library to initially tackle single-field Darwin Core validations,
>>> https://github.com/Canadensys/narwhal-processor. We hope this library
>>> is sufficiently generalized for uses outside our project.
>>>
>>> Our current challenge is to reconcile country names, which requires
>>> access to an up-to-date, well-maintained knowledge base of country
>>> names, their alternative representations (possibly multilingual), and
>>> mappings to known misspellings.
>>> For performance reasons, we'd like
>>> this thesaurus to be embedded in the library, but with the capacity to
>>> be periodically refreshed with data pulled from external resources
>>> such as dbpedia.org. This clearly has ties to semantic web thinking
>>> and, because we're new to the tools and services in this space, we'd
>>> like to solicit pointers and feedback such that we build this part of
>>> our library with maximal benefit to other projects. We started
>>> collecting thoughts here:
>>> https://github.com/Canadensys/narwhal-processor/issues/14.
>>>
>>> Cheers,
>>>
>>> David P. Shorthouse
>>> Christian Gendreau
>>> _______________________________________________
>>> tdwg mailing list
>>> [email protected]
>>> http://lists.tdwg.org/mailman/listinfo/tdwg
