John,

Here's an observation from a bystander ...

On 17 Nov 2008, at 17:17, John Goodwin wrote:
<snip>
This is also a good example of where (IMHO) the domain was perhaps over-specified. For example, all sorts of things could have publishers, not just the ones listed here. I worry that if you reuse DBpedia "publisher" elsewhere you could get some undesired inferences.

But are the DBpedia classes *intended* for re-use elsewhere? Or do they simply express restrictions that apply *within DBpedia*?

I think that in general it is useful to distinguish between two different kinds of ontologies:

a) Ontologies that express restrictions that are present in a certain dataset. They simply express what's there in the data. In this sense, they are like database schemas: If “Publisher” has a range of “Person”, then it means that the publisher *in this particular dataset* is always a person. That's not an assertion about the world, it's an assertion about the dataset. These ontologies are usually not very re-usable.

b) Ontologies that are intended as a “lingua franca” for data exchange between different applications. They are designed for broad re-use, and thus usually do not add many restrictions. In this sense, they are more like controlled vocabularies of terms. Dublin Core is probably the prototypical example, and FOAF is another good one. Because they impose so few restrictions, they usually don't allow as many interesting inferences.
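
To make the difference concrete, here is a minimal sketch in Python of what happens when a schema-style range restriction meets data from outside the dataset it describes. The ex: names and the infer_range_types helper are hypothetical, made up for illustration; they are not actual DBpedia terms, and the function applies only the RDFS range rule (if p has range C and "s p o" holds, then o is a C), not a full reasoner.

```python
# Minimal sketch of RDFS range entailment. All ex: names are hypothetical.
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"


def infer_range_types(triples):
    """Return the rdf:type triples entailed by rdfs:range declarations."""
    # Collect (property, class) pairs declared via rdfs:range.
    ranges = {(s, o) for s, p, o in triples if p == RDFS_RANGE}
    inferred = set()
    for prop, cls in ranges:
        for s, p, o in triples:
            if p == prop:
                # The object of the property is inferred to be in the range class.
                inferred.add((o, RDF_TYPE, cls))
    # Report only triples that were not already asserted.
    return inferred - set(triples)


# Schema-style restriction: within this dataset, every publisher is a person.
schema = {("ex:publisher", RDFS_RANGE, "ex:Person")}

# Data from elsewhere that reuses the property for a company.
outside_data = {("ex:SomeBook", "ex:publisher", "ex:SomeCompany")}

inferred = infer_range_types(schema | outside_data)
print(inferred)  # {('ex:SomeCompany', 'rdf:type', 'ex:Person')}
```

Inside the original dataset the restriction is harmless, because it only restates what the data already contains. Once the property is reused for a company, the same restriction produces the kind of undesired inference John describes: the company is typed as a person.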

I think that these two kinds of ontologies have very different requirements. Ontologies that are designed for one of these roles are quite useless if used for the other job. Ontologies that have not been designed for either of these two roles usually fail at both.

Returning to DBpedia, my impression is that the DBpedia ontology is intended mostly for the first role. Maybe it should be understood more as a schema for the DBpedia dataset, and not so much as a re-usable set of terms for use outside of the Wikipedia context. (I might be wrong, I was not involved in its creation.)

Richard
