Peter,
On 17 Sep 2010, at 20:48, Peter DeVries wrote:
I created the SPARQL query below for the TaxonConcept Knowledge Base.
It is based on the earlier one posted by Richard Cyganiak.
I looked through my RDF for predicates that have in and out links to
other data sets.
It is not clear to me how to count basic web pages that are not
really RDF resources.
I don't think SPARQL has any easy way of distinguishing whether the
target of a link is “just” a web page or a full-blown RDF resource.
Also where in the CKAN description do you differentiate between in
links and out links?
An outlink in our parlance is any triple that's hosted on your site
where one resource is in your namespace and the other is in
someone else's namespace. It doesn't matter which resource is in the
subject or object position.
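
As a rough illustration of that definition, here is a minimal Python
sketch. It assumes that "in your namespace" can be approximated by a
simple URI prefix test; the prefix value and the example URIs are made
up for illustration and are not taken from the actual dataset.

```python
def in_namespace(uri: str, namespace: str) -> bool:
    """Return True if the URI starts with the given namespace prefix."""
    return uri.startswith(namespace)


def is_outlink(subject: str, obj: str, own_namespace: str) -> bool:
    """A triple hosted on our site is an outlink when exactly one of its
    two resources is in our namespace; the position does not matter."""
    return in_namespace(subject, own_namespace) != in_namespace(obj, own_namespace)


# Assumed namespace prefix, for illustration only.
OWN = "http://lod.taxonconcept.org/"

# Hypothetical URIs: ours in subject position, then ours in object position.
print(is_outlink(OWN + "example", "http://dbpedia.org/resource/Example", OWN))
print(is_outlink("http://dbpedia.org/resource/Example", OWN + "example", OWN))
# Both ends ours: an internal link, not an outlink.
print(is_outlink(OWN + "example", OWN + "other", OWN))
```

Literal objects are not handled specially here; a literal would simply
fail the prefix test.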
An inlink is a triple that uses one of your URIs in the subject or
object position, but is hosted by someone else, in another dataset.
The CKAN record for your dataset only records the outlinks of your
dataset.
We find the inlinks by looking at all other CKAN records and seeing
whether any of them reference your dataset.
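
The idea of deriving inlinks from everyone else's outlinks can be
sketched in a few lines of Python. The record layout below (a dataset
name mapped to its outlink targets and counts) is a hypothetical
stand-in, not the real CKAN record format, and the dataset names and
numbers are invented.

```python
# Hypothetical records: dataset name -> {target dataset: outlink triples}.
# Each dataset only records its own outlinks.
records = {
    "taxonconcept": {"dbpedia": 100, "geospecies": 50},
    "geospecies": {"taxonconcept": 40},
    "dbpedia": {},
}


def inlinks_for(dataset, records):
    """Collect the inlinks of `dataset` by scanning the outlinks
    recorded by every other dataset."""
    found = {}
    for source, outlinks in records.items():
        if source != dataset and dataset in outlinks:
            found[source] = outlinks[dataset]
    return found


print(inlinks_for("taxonconcept", records))
```

Nothing about inlinks is stored in a dataset's own record; they only
appear once the other records are scanned.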
Best,
Richard
I am posting the query and results here so others might benefit from
them or inform me of something I may be doing incorrectly.
Below is the query, followed by the results as text. I have also
attached a .png of the Virtuoso iSPARQL results.
- Pete
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX txn: <http://lod.taxonconcept.org/ontology/txn.owl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX umbel: <http://umbel.org/umbel#>
SELECT ?domain_s ?domain_o (COUNT(*) AS ?count)
WHERE {
{
SELECT (bif:regexp_substr("http://([^/]*)", STR(?s), 1) AS ?domain_s)
(bif:regexp_substr("http://([^/]*)", STR(?o), 1) AS ?domain_o)
WHERE {
{ ?s owl:sameAs ?o }
UNION
{ ?s skos:exactMatch ?o }
UNION
{ ?s skos:broadMatch ?o }
UNION
{ ?s skos:narrowMatch ?o }
UNION
{ ?s skos:relatedMatch ?o }
UNION
{ ?s skos:closeMatch ?o }
UNION
{ ?s txn:speciesConceptHasSpeciesNameString ?o }
UNION
{ ?s txn:speciesNameStringHasSpeciesTaxonConcept ?o }
UNION
{ ?s txn:speciesConceptHasBasionymNameString ?o }
UNION
{ ?s txn:basionymNameStringHasSpeciesTaxonConcept ?o }
UNION
{ ?s txn:hasPDFVersion ?o }
UNION
{ ?s txn:hasAuthorURI ?o }
UNION
{ ?s foaf:page ?o }
UNION
{ ?s foaf:topic ?o }
UNION
{ ?s txn:inDBpediaClade ?o }
UNION
{ ?s txn:occurrenceInContinent ?o }
UNION
{ ?s txn:occurrenceInStateProvince ?o }
UNION
{ ?s txn:occurrenceInCounty ?o }
UNION
{ ?s txn:isExpectedIn ?o }
UNION
{ ?s txn:hasExpectationOf ?o }
UNION
{ ?s txn:isUnknownAboutIn ?o }
UNION
{ ?s txn:hasUnknownExpectationOf ?o }
UNION
{ ?s txn:isUnexpectedIn ?o }
UNION
{ ?s txn:hasUnknownExpectationOf ?o }
}
}
}
GROUP BY ?domain_s ?domain_o
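
For readers without a Virtuoso endpoint, the counting logic of the
query can be approximated in Python. This is only a sketch: it assumes
the subject/object pairs have already been fetched, it uses the same
regular expression as the query, and it uses an empty string where the
pattern does not match (for example, for a literal object). The
function names and the sample pairs are made up.

```python
import re
from collections import Counter

# Same pattern as in the query: capture the host part of an http URI.
DOMAIN = re.compile(r"http://([^/]*)")


def domain(term):
    """Return the host of an http URI, or "" if the pattern does not match."""
    match = DOMAIN.search(term)
    return match.group(1) if match else ""


def count_domain_pairs(pairs):
    """Count (subject domain, object domain) combinations, like the
    GROUP BY ?domain_s ?domain_o in the query."""
    return Counter((domain(s), domain(o)) for s, o in pairs)


# Hypothetical subject/object pairs, for illustration only.
sample = [
    ("http://lod.taxonconcept.org/a", "http://dbpedia.org/resource/B"),
    ("http://lod.taxonconcept.org/c", "http://dbpedia.org/resource/D"),
    ("http://lod.taxonconcept.org/a", "A plain name string"),
]
for (domain_s, domain_o), count in count_domain_pairs(sample).items():
    print(domain_s, domain_o, count)
```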
==============================
domain_s domain_o count
lod.geospecies.org lod.taxonconcept.org 71757
www.uniprot.org lod.taxonconcept.org 23427
bio2rdf.org lod.taxonconcept.org 23427
dbpedia.org lod.taxonconcept.org 18849
eunis.eea.europa.eu lod.taxonconcept.org 2987
www.bbc.co.uk lod.taxonconcept.org 318
lod.taxonconcept.org lod.geospecies.org 71756
lod.taxonconcept.org www.uniprot.org 23427
lod.taxonconcept.org bio2rdf.org 23656
lod.taxonconcept.org dbpedia.org 95208
lod.taxonconcept.org eunis.eea.europa.eu 5974
lod.taxonconcept.org www.bbc.co.uk 636
rdf.freebase.com lod.taxonconcept.org 119
lod.taxonconcept.org 72
lod.taxonconcept.org rdf.freebase.com 119
lod.taxonconcept.org 24900
sw.opencyc.org lod.taxonconcept.org 24
lod.taxonconcept.org sw.opencyc.org 24
lod.taxonconcept.org gni.globalnames.org 73329
gni.globalnames.org lod.taxonconcept.org 73330
lod.taxonconcept.org www.americanarachnology.org 1
lod.taxonconcept.org assets.geospecies.org 3
lod.taxonconcept.org www.itis.gov 42100
lod.taxonconcept.org data.gbif.org 1154
lod.taxonconcept.org en.wikipedia.org 18849
lod.taxonconcept.org species.wikimedia.org 9328
lod.taxonconcept.org www.eol.org 579
lod.taxonconcept.org www.boldsystems.org 122
lod.taxonconcept.org www.catalogueoflife.org 53
lod.taxonconcept.org bugguide.net 3297
lod.taxonconcept.org lod.taxonconcept.org 287048
assets.geospecies.org media.geospecies.org 5
lod.taxonconcept.org mushroomobserver.org 5
assets.geospecies.org lod.geospecies.org 10
assets.geospecies.org lod.taxonconcept.org 1
static.flickr.com www.flickr.com 33
bugguide.net lod.taxonconcept.org 3297
media.geospecies.org lod.taxonconcept.org 19
ocs.geospecies.org lod.taxonconcept.org 26
media.geospecies.org dbpedia.org 14
assets.geospecies.org dbpedia.org 1
media.geospecies.org lod.geospecies.org 37
mushroomobserver.org lod.taxonconcept.org 5
media.geospecies.org media.geospecies.org 29
ocs.geospecies.org ocs.geospecies.org 53
media.geospecies.org assets.geospecies.org 15
media.geospecies.org static.flickr.com 2
mushroomobserver.org mushroomobserver.org 3
mushroomobserver.org dbpedia.org 1
ocs.geospecies.org sws.geonames.org 39
lod.taxonconcept.org sws.geonames.org 234792
sws.geonames.org lod.taxonconcept.org 128529
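
Since an outlink does not depend on which side is the subject, rows
like these could be folded into one total per external domain. The
Python sketch below does that for a handful of rows copied from the
results above. It assumes every counted triple is hosted in the
TaxonConcept dataset, which the query itself does not guarantee, so
the totals are illustrative only.

```python
from collections import Counter

OWN = "lod.taxonconcept.org"

# A few rows copied from the results: (domain_s, domain_o, count).
rows = [
    ("lod.geospecies.org", "lod.taxonconcept.org", 71757),
    ("lod.taxonconcept.org", "lod.geospecies.org", 71756),
    ("dbpedia.org", "lod.taxonconcept.org", 18849),
    ("lod.taxonconcept.org", "dbpedia.org", 95208),
    ("lod.taxonconcept.org", "lod.taxonconcept.org", 287048),
    ("media.geospecies.org", "dbpedia.org", 14),
]


def outlink_totals(rows, own):
    """Sum counts per external domain for rows where exactly one side
    is our own domain; internal and unrelated rows are skipped."""
    totals = Counter()
    for domain_s, domain_o, count in rows:
        if (domain_s == own) != (domain_o == own):
            other = domain_o if domain_s == own else domain_s
            totals[other] += count
    return totals


for other, total in outlink_totals(rows, OWN).items():
    print(other, total)
```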
--
----------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base <http://www.taxonconcept.org/> /
GeoSpecies Knowledge Base <http://lod.geospecies.org/>
About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
------------------------------------------------------------
<interlinking_capture.png>