I am a fan of the SPARQL result set format whenever people want to express 
tuples of nodes:

http://www.w3.org/TR/sparql11-results-csv-tsv/

I think it’s more standard than Turtle, and it is as efficient as you’ll get 
unless you want a binary format.

This file can be processed with simple streaming tools like awk, or even passed 
into something like Pig. If you want to load the facts into the triple store, 
you could just toss out the relevance rating, or keep only facts where the 
relevance rating is 9 or above. If you wanted to produce the kind of RDF you 
suggest, you could do that too. You could also md5sum the triples and stuff 
the relevance data in a key-value store, where it won’t add load to the triple 
store.
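
A minimal sketch of that last idea in Python, under some assumptions: the function name `process`, the use of rdf:type as the predicate, and the plain dict standing in for a key-value store are all my own choices for illustration. The subject and class URI patterns are borrowed from the examples in the quoted message below.

```python
import hashlib

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"  # assumed predicate
RESOURCE = "http://airpedia.org/resource/"
ONTOLOGY = "http://dbpedia.org/ontology/"


def process(lines, min_relevance=9):
    """Return (triples, relevance_by_key) for rows at or above the threshold."""
    triples = []
    relevance_by_key = {}  # stand-in for a real key-value store
    for line in lines:
        fields = line.split()
        # Skip blank lines and the "#ID Class Relevance" header.
        if not fields or fields[0].startswith("#"):
            continue
        item_id, cls, relevance = fields
        relevance = int(relevance)
        if relevance < min_relevance:
            continue
        triple = f"<{RESOURCE}{item_id}> <{RDF_TYPE}> <{ONTOLOGY}{cls}> ."
        triples.append(triple)
        # Key the relevance by the md5 of the serialized triple.
        key = hashlib.md5(triple.encode("utf-8")).hexdigest()
        relevance_by_key[key] = relevance
    return triples, relevance_by_key
```

The triples go to the triple store as-is; the relevance can be looked up later by hashing the same serialized triple.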

From: Alessio Palmero Aprosio 
Sent: Monday, June 10, 2013 11:15 AM
To: dbpedia-discussion 
Subject: [Dbpedia-discussion] Airpedia resource

Dear DBpedia community,
I am a PhD student from Fondazione Bruno Kessler [1] in Trento and I’m working 
with my team on Airpedia [2], which is a semantic resource based on machine 
learning techniques that aims to extend the coverage of DBpedia on classes 
(and, in a second step, on properties).

A draft version of the resource is available on our website. We are currently 
working on releasing it to the Semantic Web community, and investigating the 
best RDF format to use.

Currently, we use a simple CSV format. For example:

#ID    Class    Relevance
140132    Eukaryote    10
140132    Animal    10
140132    Fish    10
140132    Species    10
140137    OlympicResult    8
140143    Eukaryote    10
140143    Amphibian    10
140143    Animal    10
140143    Species    10

The ID column refers to a WikiData ID, which can be resolved on the WikiData 
website at http://wikidata.org/wiki/Q<ID>; the Class column is the guessed 
DBpedia class; the Relevance column is our confidence in the class (from 7 to 
10, being a k-NN voting, k=10). It is really easy for us to retrieve the 
DBpedia ID given the WikiData ID.
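
To make the column layout concrete, here is a small Python sketch of reading these rows. The names `Row` and `parse_rows` are hypothetical, and the sketch assumes the columns are separated by whitespace and that class names contain no spaces, as in the sample above.

```python
from typing import Iterable, Iterator, NamedTuple


class Row(NamedTuple):
    wikidata_id: str
    dbpedia_class: str
    relevance: int

    @property
    def wikidata_url(self) -> str:
        # URL pattern as described in the text: http://wikidata.org/wiki/Q<ID>
        return "http://wikidata.org/wiki/Q" + self.wikidata_id


def parse_rows(lines: Iterable[str]) -> Iterator[Row]:
    """Yield one Row per data line, skipping blanks and the '#' header."""
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        wikidata_id, dbpedia_class, relevance = fields
        yield Row(wikidata_id, dbpedia_class, int(relevance))
```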

Which is, in your opinion, the best way to represent this data in RDF, keeping 
in mind that we want to differentiate our triples from the original DBpedia 
ones and we want the relevance to be preserved?

We have in mind the following candidate solutions.
(“air” is our RDF namespace)


Solution 1 (string concatenation)

  a. ID air:type Class . 
  b. ID_Class air:confidence Relevance . 
  c. sameAses 
For example:
140132    Eukaryote    10
140132    Animal    10
140132    Fish    10
140132    Species    10

becomes:
<http://airpedia.org/resource/140132> <http://airpedia.org/vocab/01/#type> 
<http://dbpedia.org/ontology/Eukaryote> .
<http://airpedia.org/resource/140132_Eukaryote> 
<http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132> <http://airpedia.org/vocab/01/#type> 
<http://dbpedia.org/ontology/Animal> .
<http://airpedia.org/resource/140132_Animal> 
<http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132> <http://airpedia.org/vocab/01/#type> 
<http://dbpedia.org/ontology/Fish> .
<http://airpedia.org/resource/140132_Fish> 
<http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132> <http://airpedia.org/vocab/01/#type> 
<http://dbpedia.org/ontology/Species> .
<http://airpedia.org/resource/140132_Species> 
<http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132> owl:sameAs 
<http://dbpedia.org/resource/Big_skate> .
<http://airpedia.org/resource/140132> owl:sameAs 
<http://ca.dbpedia.org/resource/Raja_binoculata> .
...
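
As a rough illustration, the Solution 1 mapping for one CSV row could be sketched in Python as follows. The function name `solution1_triples` is hypothetical, the sameAs links are left out, and the datatype is written as a full IRI (the usual expansion of xsd:int) so that each output line is self-contained.

```python
AIR_RESOURCE = "http://airpedia.org/resource/"
AIR_VOCAB = "http://airpedia.org/vocab/01/#"
DBPEDIA_ONTOLOGY = "http://dbpedia.org/ontology/"
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"


def solution1_triples(item_id, cls, relevance):
    """Return the two Solution 1 triples for one (ID, Class, Relevance) row."""
    subject = f"<{AIR_RESOURCE}{item_id}>"
    # The confidence hangs off a resource named by concatenating ID and class.
    id_class = f"<{AIR_RESOURCE}{item_id}_{cls}>"
    return [
        f"{subject} <{AIR_VOCAB}type> <{DBPEDIA_ONTOLOGY}{cls}> .",
        f'{id_class} <{AIR_VOCAB}confidence> "{relevance}"^^<{XSD_INT}> .',
    ]
```

Note that in this scheme the link between the type triple and its confidence exists only through the naming convention of the concatenated URI.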


Solution 2 (blank nodes)

  a. ID air:isClassified bNode 
  b. bNode air:type Class 
  c. bNode air:confidence Relevance 
  d. sameAses 
For example:
140132    Eukaryote    10
140132    Animal    10
140132    Fish    10
140132    Species    10

becomes:
<http://airpedia.org/resource/140132> 
<http://airpedia.org/vocab/01/#isClassified> _:1 .
_:1 <http://airpedia.org/vocab/01/#type> 
<http://dbpedia.org/ontology/Eukaryote> .
_:1 <http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132> 
<http://airpedia.org/vocab/01/#isClassified> _:2 .
_:2 <http://airpedia.org/vocab/01/#type> <http://dbpedia.org/ontology/Fish> .
_:2 <http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132> 
<http://airpedia.org/vocab/01/#isClassified> _:3 .
_:3 <http://airpedia.org/vocab/01/#type> <http://dbpedia.org/ontology/Animal> .
_:3 <http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132> 
<http://airpedia.org/vocab/01/#isClassified> _:4 .
_:4 <http://airpedia.org/vocab/01/#type> <http://dbpedia.org/ontology/Species> .
_:4 <http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132> owl:sameAs 
<http://dbpedia.org/resource/Big_skate> .
<http://airpedia.org/resource/140132> owl:sameAs 
<http://ca.dbpedia.org/resource/Raja_binoculata> .
…
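
A comparable sketch for Solution 2, again in Python and again only an illustration: the function name `solution2_triples` and the blank node labels of the form `_:b1`, `_:b2` are assumptions made here (the example above uses `_:1`, `_:2`), and the sameAs links are omitted.

```python
from itertools import count

AIR_RESOURCE = "http://airpedia.org/resource/"
AIR_VOCAB = "http://airpedia.org/vocab/01/#"
DBPEDIA_ONTOLOGY = "http://dbpedia.org/ontology/"
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"


def solution2_triples(rows):
    """Return Solution 2 triples for an iterable of (ID, Class, Relevance)."""
    labels = count(1)  # one fresh blank node per classification
    triples = []
    for item_id, cls, relevance in rows:
        node = f"_:b{next(labels)}"
        triples.append(f"<{AIR_RESOURCE}{item_id}> <{AIR_VOCAB}isClassified> {node} .")
        triples.append(f"{node} <{AIR_VOCAB}type> <{DBPEDIA_ONTOLOGY}{cls}> .")
        triples.append(f'{node} <{AIR_VOCAB}confidence> "{relevance}"^^<{XSD_INT}> .')
    return triples
```

Each row yields three triples instead of two, but the class and its confidence are tied together by the graph structure rather than by a URI naming convention.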


While waiting for your suggestions, we will finish the classification and make 
the CSV available on our website [2].

Thank you!
Best,
Alessio

[1] http://www.fbk.eu
[2] http://www.airpedia.org 


--------------------------------------------------------------------------------
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion