OK, maybe I was not very clear in my previous mail: the question was
whether to use blank nodes or string concatenation (or whether someone
can propose a better solution).
We can discuss the serialization format in a separate mail thread :)

Alessio


On 10/06/13 20:38, Paul A. Houle wrote:
I am a fan of the SPARQL result set format whenever people want to
express tuples of nodes:
http://www.w3.org/TR/sparql11-results-csv-tsv/
I think it’s more standard than Turtle, and it is as efficient as
you’ll get unless you want a binary format.
This file can be processed with simple streaming tools like awk or
even passed into something like Pig.  If you want to load the facts
into the triple store you could just toss out the relevance rating or
filter only facts where the relevance rating is 9 or above.  If you
wanted to produce the kind of RDF you suggest,  you could do that
too.  You could also md5sum the triples and stuff the relevance data
in a key-value store where it won’t add load to the triple store.
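
A minimal sketch of the last two ideas (filtering on the relevance rating, and keying the relevance on a hash of the fact) might look like the Python below. The function names, the tab-joined string that gets hashed, and the plain dict standing in for a key-value store are assumptions made for illustration only; the rows are copied from the CSV sample in this thread.

```python
import hashlib

# Sample rows from the CSV in this thread: (ID, class, relevance).
ROWS = [
    ("140132", "Fish", 10),
    ("140137", "OlympicResult", 8),
    ("140143", "Amphibian", 10),
]

def filter_relevant(rows, threshold=9):
    """Keep only rows whose relevance is at or above the threshold."""
    return [row for row in rows if row[2] >= threshold]

def triple_key(item_id, class_name):
    """MD5 hex digest of the (ID, class) pair, used as a lookup key.

    Hashing "ID<TAB>class" is an assumption; any stable serialization
    of the triple would do.
    """
    return hashlib.md5(f"{item_id}\t{class_name}".encode("utf-8")).hexdigest()

def relevance_index(rows):
    """Map each key to its relevance; a dict stands in for a key-value store."""
    return {triple_key(item_id, cls): rel for item_id, cls, rel in rows}
```

With this split, the triple store only holds the filtered facts, and the relevance can be looked up separately by recomputing the key for a given ID and class.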
*From:* Alessio Palmero Aprosio <mailto:[email protected]>
*Sent:* Monday, June 10, 2013 11:15 AM
*To:* dbpedia-discussion
<mailto:[email protected]>
*Subject:* [Dbpedia-discussion] Airpedia resource
Dear DBpedia community,
I am a PhD student from Fondazione Bruno Kessler [1] in Trento and I’m
working with my team on Airpedia [2], which is a semantic resource
based on machine learning techniques that aims to extend the coverage
of DBpedia on classes (and, in a second step, on properties).

A draft version of the resource is available on our website. We are
currently working on releasing it to the Semantic Web community, and
investigating the best RDF format to use.

Currently, we use a simple CSV format. For example:

#ID    Class    Relevance
140132    Eukaryote    10
140132    Animal    10
140132    Fish    10
140132    Species    10
140137    OlympicResult    8
140143    Eukaryote    10
140143    Amphibian    10
140143    Animal    10
140143    Species    10

The ID column refers to a WikiData ID, which can be resolved on the
WikiData website at http://wikidata.org/wiki/Q<ID>; the Class
column is the guessed DBpedia class; the Relevance column is our
confidence in the class (from 7 to 10, since it comes from a k-NN vote
with k=10).
It is really easy for us to retrieve the DBpedia ID given the WikiData ID.
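
To make the column layout concrete, here is a small Python sketch that reads rows in this format and builds the WikiData link for an ID. It assumes the columns are separated by whitespace and that lines starting with "#" are headers; the function names are hypothetical.

```python
from collections import defaultdict

SAMPLE = """\
#ID    Class    Relevance
140132    Eukaryote    10
140132    Animal    10
140137    OlympicResult    8
"""

def parse(text):
    """Group (class, relevance) pairs by WikiData ID."""
    by_id = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header line
        item_id, class_name, relevance = line.split()
        by_id[item_id].append((class_name, int(relevance)))
    return dict(by_id)

def wikidata_url(item_id):
    """Build the WikiData page URL from the numeric ID, as described above."""
    return f"http://wikidata.org/wiki/Q{item_id}"
```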

Which is, in your opinion, the best way to represent this data in RDF,
keeping in mind that we want to differentiate our triples from the
original DBpedia ones and we want the relevance to be preserved?

We have in mind the following candidate solutions.
(“air” is our RDF namespace)


*Solution 1 (string concatenation)*

  * ID air:type Class .
  * ID_Class air:confidence Relevance .
  * sameAses

For example:
140132    Eukaryote    10
140132    Animal    10
140132    Fish    10
140132    Species    10

becomes:
<http://airpedia.org/resource/140132>
<http://airpedia.org/vocab/01/#type>
<http://dbpedia.org/ontology/Eukaryote> .
<http://airpedia.org/resource/140132_Eukaryote>
<http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132>
<http://airpedia.org/vocab/01/#type>
<http://dbpedia.org/ontology/Animal> .
<http://airpedia.org/resource/140132_Animal>
<http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132>
<http://airpedia.org/vocab/01/#type> <http://dbpedia.org/ontology/Fish> .
<http://airpedia.org/resource/140132_Fish>
<http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132>
<http://airpedia.org/vocab/01/#type>
<http://dbpedia.org/ontology/Species> .
<http://airpedia.org/resource/140132_Species>
<http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132> owl:sameAs
<http://dbpedia.org/resource/Big_skate> .
<http://airpedia.org/resource/140132> owl:sameAs
<http://ca.dbpedia.org/resource/Raja_binoculata> .
...
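
The mapping in Solution 1 can be sketched in Python as follows. This is only an illustration under a few assumptions: the function name is hypothetical, the output is written as N-Triples-style lines, and xsd:int is expanded to its full IRI because N-Triples has no prefixes. The sameAs links are left out.

```python
AIR_RES = "http://airpedia.org/resource/"
AIR_VOCAB = "http://airpedia.org/vocab/01/#"
DBO = "http://dbpedia.org/ontology/"
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"

def solution1_triples(item_id, class_name, relevance):
    """Return the two statements Solution 1 produces for one CSV row."""
    subject = f"<{AIR_RES}{item_id}>"
    # The confidence is attached to a second resource whose IRI is the
    # concatenation of the ID and the class name.
    statement = f"<{AIR_RES}{item_id}_{class_name}>"
    return [
        f"{subject} <{AIR_VOCAB}type> <{DBO}{class_name}> .",
        f'{statement} <{AIR_VOCAB}confidence> "{relevance}"^^<{XSD_INT}> .',
    ]
```

Note that nothing in the data links the concatenated resource back to the original subject and class; a consumer has to know the naming convention to join them.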


*Solution 2 (blank nodes)*

  * ID air:isClassified bNode
  * bNode air:type Class
  * bNode air:confidence Relevance
  * sameAses

For example:
140132    Eukaryote    10
140132    Animal    10
140132    Fish    10
140132    Species    10

becomes:
<http://airpedia.org/resource/140132>
<http://airpedia.org/vocab/01/#isClassified> _:1 .
_:1 <http://airpedia.org/vocab/01/#type>
<http://dbpedia.org/ontology/Eukaryote> .
_:1 <http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132>
<http://airpedia.org/vocab/01/#isClassified> _:2 .
_:2 <http://airpedia.org/vocab/01/#type>
<http://dbpedia.org/ontology/Fish> .
_:2 <http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132>
<http://airpedia.org/vocab/01/#isClassified> _:3 .
_:3 <http://airpedia.org/vocab/01/#type>
<http://dbpedia.org/ontology/Animal> .
_:3 <http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132>
<http://airpedia.org/vocab/01/#isClassified> _:4 .
_:4 <http://airpedia.org/vocab/01/#type>
<http://dbpedia.org/ontology/Species> .
_:4 <http://airpedia.org/vocab/01/#confidence> "10"^^xsd:int .
<http://airpedia.org/resource/140132> owl:sameAs
<http://dbpedia.org/resource/Big_skate> .
<http://airpedia.org/resource/140132> owl:sameAs
<http://ca.dbpedia.org/resource/Raja_binoculata> .
…
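
For comparison, Solution 2 can be sketched the same way. Again this is an illustration only: the function name is hypothetical, xsd:int is expanded to its full IRI, and the blank node labels are written as _:b1, _:b2, ... rather than _:1, _:2, ... on the assumption that a leading letter is safer with stricter parsers. The sameAs links are left out.

```python
import itertools

AIR_RES = "http://airpedia.org/resource/"
AIR_VOCAB = "http://airpedia.org/vocab/01/#"
DBO = "http://dbpedia.org/ontology/"
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"

def solution2_triples(rows):
    """Return three statements per (ID, class, relevance) row.

    Each row gets a fresh blank node that carries both the class
    and the confidence.
    """
    counter = itertools.count(1)
    lines = []
    for item_id, class_name, relevance in rows:
        bnode = f"_:b{next(counter)}"
        lines.append(f"<{AIR_RES}{item_id}> <{AIR_VOCAB}isClassified> {bnode} .")
        lines.append(f"{bnode} <{AIR_VOCAB}type> <{DBO}{class_name}> .")
        lines.append(f'{bnode} <{AIR_VOCAB}confidence> "{relevance}"^^<{XSD_INT}> .')
    return lines
```

Here the class and its confidence hang off the same node, so they can be joined in a query without relying on an IRI naming convention, at the cost of one extra statement per row.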


While waiting for your suggestions, we will finish the classification and
make the CSV available on our website [2].

Thank you!
Best,
Alessio

[1] http://www.fbk.eu
[2] http://www.airpedia.org

------------------------------------------------------------------------
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
