Hi Peter,
I see that you already have a dataset dump available. Could I also
suggest the use of a semantic sitemap [1], so that search engines such as
Sindice can find, process and index your dump?
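To give an idea of what such a sitemap might look like, here is a minimal sketch in Python that generates one for the dump. The element names (sc:dataset, sc:datasetLabel, sc:linkedDataPrefix, sc:sampleURI, sc:dataDumpLocation) and the sc namespace URI are written from memory of the extension described in [1] and should be checked against it. The label, prefix and sample URI values are assumptions chosen for illustration, and build_semantic_sitemap is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Standard sitemap namespace, plus the semantic sitemap extension namespace.
# The extension namespace is an assumption; please verify it against [1].
SM_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
SC_NS = "http://sw.deri.org/2007/07/sitemapextension/scschema.xsd"


def build_semantic_sitemap(label, linked_data_prefix, sample_uri, dump_url):
    """Return a semantic sitemap, as an XML string, describing one dataset."""
    ET.register_namespace("", SM_NS)
    ET.register_namespace("sc", SC_NS)

    urlset = ET.Element(f"{{{SM_NS}}}urlset")
    dataset = ET.SubElement(urlset, f"{{{SC_NS}}}dataset")

    # Element names as recalled from the extension; treat them as assumptions.
    fields = (
        ("datasetLabel", label),
        ("linkedDataPrefix", linked_data_prefix),
        ("sampleURI", sample_uri),
        ("dataDumpLocation", dump_url),
    )
    for tag, text in fields:
        ET.SubElement(dataset, f"{{{SC_NS}}}{tag}").text = text

    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_semantic_sitemap(
    label="GeoSpecies Knowledge Base",                 # illustrative label
    linked_data_prefix="http://lod.geospecies.org/",   # assumed prefix
    sample_uri="http://lod.geospecies.org/ses/ICmLC",
    dump_url="http://lod.geospecies.org/geospecies.rdf.tar.gz",
)
print(sitemap_xml)
```

The output would typically be saved as sitemap.xml and advertised with a "Sitemap:" line in robots.txt, as with ordinary sitemaps.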
Best,
[1] http://sw.deri.org/2007/07/sitemapextension
--
Renaud Delbru
On 29/10/09 21:10, Peter DeVries wrote:
I have updated the GeoSpecies data set.
You can read about it here:
http://about.geospecies.org/
You can browse it here:
http://lod.geospecies.org/
The new RDF dump can be obtained here:
http://lod.geospecies.org/geospecies.rdf.tar.gz (1,765,790 Triples)
The data set currently contains information and linked data for:
15,862 Species, 1,291 Families, 206 Orders. We have approximately 6,500
species observations, but are awaiting release on the majority of
those. The current data set includes 12 sample observation records
with geo and geonames links. There is also a growing number of
GeoSpecies annotated articles and presentations in the bibtex and
bibio vocabularies. The knowledge base is currently linked to DBpedia,
Freebase, Bio2RDF, Uniprot, uBio data sources, and uses some of the
umbel subject concepts. See the project page for information on proper
attribution. Until they have been fully documented, the bulk of the
observation records are not currently available.
I have attempted to link to dbpedia, bio2rdf, uniprot and freebase
when possible using skos:closeMatch. Of the 15,862 species, 5,684 are
linked to dbpedia and wikipedia, 8,948 are linked to bio2rdf and
uniprot. There are also foaf:isPrimaryTopicOf links to 8,910
Wikispecies pages. Similar linkages are made at the other taxonomic
levels of kingdom, phylum, class, order and family.
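As a rough illustration of how these skos:closeMatch links could be pulled out of an RDF/XML document, here is a minimal Python sketch using only the standard library. The sample document is hypothetical: its structure (rdf:Description with rdf:about) and the example.org target URIs are placeholders, not actual GeoSpecies data, and close_matches is a made-up helper name.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical sample; the real dump may be structured differently.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://lod.geospecies.org/ses/ICmLC">
    <skos:closeMatch rdf:resource="http://example.org/placeholder/dbpedia-resource"/>
    <skos:closeMatch rdf:resource="http://example.org/placeholder/bio2rdf-resource"/>
  </rdf:Description>
</rdf:RDF>"""


def close_matches(rdf_xml):
    """Return (subject, target) pairs for every skos:closeMatch link."""
    root = ET.fromstring(rdf_xml)
    links = []
    for subject in root.iter():
        about = subject.get(f"{{{RDF}}}about")
        if about is None:
            continue  # only elements that name a subject with rdf:about
        # Direct children that are skos:closeMatch with an rdf:resource target.
        for match in subject.findall(f"{{{SKOS}}}closeMatch"):
            target = match.get(f"{{{RDF}}}resource")
            if target:
                links.append((about, target))
    return links


for subject, target in close_matches(SAMPLE):
    print(subject, "skos:closeMatch", target)
```

For the full dump a proper RDF library would be the better choice; this sketch only handles the simple rdf:about / rdf:resource form shown in the sample.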
Here is the page for the Silver-bordered Fritillary Butterfly Boloria
selene Denis and Schiffermuller 1775
http://lod.geospecies.org/ses/ICmLC.html
The "entity" is
http://lod.geospecies.org/ses/ICmLC
The RDF is
http://lod.geospecies.org/ses/ICmLC.rdf
The levels above species and family are in XHTML with RDFa, but also
have a straight RDF representation.
Order Carnivora
http://lod.geospecies.org/orders/jtSaY.xhtml
RDF version
http://lod.geospecies.org/orders/jtSaY.rdf
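The URL pattern in the examples above can be sketched in Python as follows. This only mirrors the suffixes shown (.html for the species page, .xhtml for the order page, .rdf for both). The function name representation_urls is hypothetical, and the suffix-free order URI is an assumption made by analogy with the species "entity" URI.

```python
def representation_urls(entity_uri, html_suffix=".html"):
    """Derive the page and RDF URLs for an entity URI by appending suffixes."""
    base = entity_uri.rstrip("/")
    return {
        "entity": base,
        "html": base + html_suffix,  # ".html" for species, ".xhtml" for orders
        "rdf": base + ".rdf",
    }


species = representation_urls("http://lod.geospecies.org/ses/ICmLC")
# Assumed entity URI for the order, by analogy with the species example.
order = representation_urls(
    "http://lod.geospecies.org/orders/jtSaY", html_suffix=".xhtml"
)

print(species["html"])
print(species["rdf"])
print(order["html"])
print(order["rdf"])
```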
This page has some example SPARQL queries.
http://about.geospecies.org/sparql.xhtml
You can find the ontology documentation here:
http://rdf.geospecies.org/gs_ont_doc/index.html
It is mainly a vocabulary, since I have had trouble getting all the
related ontologies to play well together.
The SPARQL query examples will work as described on the RDF dataset
without the ontology.
This is only a fraction of the world's species, but it includes all the
world's mammals and North American birds.
I will be working to improve the data set's depth, breadth and
linkages over time, and would appreciate any comments or suggestions :-)
My long-term plan is to also add biologically relevant assertions to
allow useful semantic queries about species.
I have started to add state and county level records from the USDA
Plants dataset for Wisconsin, Iowa, Michigan and Minnesota.
In addition, I have started to make links between habitats and species.
- Pete
----------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
GeoSpecies Knowledge Base
About the GeoSpecies Knowledge Base
------------------------------------------------------------