Kingsley Idehen wrote:
>
> One approach, using SIOC ontology:
> Try C as a property of an Entity of Type: sioc:Container that is
> associated with another Entity of Type: sioc:Item that has properties
> for x (Place) and y (Person). De-referencing the sioc:Container URI
> will expose data for C, and via the "sioc:container_of" property you
> will get to data for x and/or y.
>
I've been thinking a lot about this over the weekend.
I think the really valuable thing I could provide isn't the matrix
itself, but a set of facts derived from the matrix and then
filtered by human effort.
The "A Newspaper is an Organization" fact is a good example. It can
be easily stated in RDFS, and live alongside the assertion that "A
Newspaper is a Creative Work". In a detailed model, we could
distinguish between
(1) The New York Times (as a bibliographic entry)
(2) The New York Times Company (which also owns the Boston Globe,
WQXR-FM, 15 other newspapers and 15% of the Boston Red Sox)
(3) The Business Unit inside the NYTC which produces the New York Times.
If we follow the principle that "dbpedia is about wikipedia", then
I think we can say that
http://en.wikipedia.org/wiki/The_New_York_Times
is about (1) and (3), and that
http://en.wikipedia.org/wiki/The_New_York_Times_Company
is about (2). So overall, I think it's right to say that
http://en.wikipedia.org/wiki/The_New_York_Times
is both a Work and an Organization. The Work and the Organization
are certainly conflated in the "commonsense" model that most people have.
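
To make the "both a Work and an Organization" idea concrete, here is a minimal sketch of the RDFS subclass rule over plain Python tuples. The "ex:" prefix and every name under it are hypothetical placeholders, not terms from the actual dbpedia ontology, and the tuple representation is only an assumption for illustration, not how any real triple store works.

```python
# Minimal RDFS-style subclass inference over (subject, predicate, object) tuples.
# "ex:" is a hypothetical namespace of my own; nothing here is copied from
# the real dbpedia ontology.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

TRIPLES = {
    ("ex:Newspaper", SUBCLASS_OF, "ex:Organization"),
    ("ex:Newspaper", SUBCLASS_OF, "ex:Work"),
    ("ex:The_New_York_Times", RDF_TYPE, "ex:Newspaper"),
}


def infer_types(triples):
    """Apply the subclass rule until no new rdf:type triples appear."""
    closed = set(triples)
    while True:
        new = {
            (s, RDF_TYPE, d)
            for (s, p, c) in closed if p == RDF_TYPE
            for (c2, p2, d) in closed if p2 == SUBCLASS_OF and c2 == c
        } - closed
        if not new:
            return closed
        closed |= new


def types_of(resource, triples):
    """Return the sorted list of asserted and inferred types of a resource."""
    return sorted(o for (s, p, o) in infer_types(triples)
                  if s == resource and p == RDF_TYPE)


print(types_of("ex:The_New_York_Times", TRIPLES))
# ['ex:Newspaper', 'ex:Organization', 'ex:Work']
```

Nothing in the subclass rule objects to a resource ending up with two types; a conflict only arises once someone also declares the two classes disjoint.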
Thinking about it over the weekend, I've realized that I'm free to
assert any triples I want. I can publish my own additions to the
dbpedia ontology, and people are free to use them or not use them. The
one thing that I shouldn't do is mint new URIs under dbpedia's
namespace: if I did want to add new types or predicates to the dbpedia
ontology, I would need to define them in my own namespace.
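
A rough sketch of how that rule could be checked mechanically before publishing: flag any URI that sits under the dbpedia ontology namespace but is not in a list of terms known to exist there. The `KNOWN_DBPEDIA_TERMS` set, the `MY_NS` namespace, and the sample triples are all assumptions made up for illustration; a real check would need the actual list of published terms.

```python
DBPEDIA_ONTOLOGY = "http://dbpedia.org/ontology/"
MY_NS = "http://example.org/my-ontology/"  # hypothetical namespace of my own

# Illustrative stand-in for "terms dbpedia already publishes"; not a real listing.
KNOWN_DBPEDIA_TERMS = {
    DBPEDIA_ONTOLOGY + "Newspaper",
    DBPEDIA_ONTOLOGY + "Organisation",
}

SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

MY_TRIPLES = [
    # Fine: a new statement about terms assumed to exist already.
    (DBPEDIA_ONTOLOGY + "Newspaper", SUBCLASS_OF, DBPEDIA_ONTOLOGY + "Organisation"),
    # Fine: a new class minted under my own namespace.
    (MY_NS + "BusinessUnit", SUBCLASS_OF, DBPEDIA_ONTOLOGY + "Organisation"),
    # Not fine: a new class minted under dbpedia's namespace.
    (DBPEDIA_ONTOLOGY + "BusinessUnit", SUBCLASS_OF, DBPEDIA_ONTOLOGY + "Organisation"),
]


def foreign_minted_terms(triples, known_terms):
    """Return URIs under the dbpedia ontology namespace that are not known terms."""
    minted = set()
    for triple in triples:
        for term in triple:
            if term.startswith(DBPEDIA_ONTOLOGY) and term not in known_terms:
                minted.add(term)
    return minted


print(foreign_minted_terms(MY_TRIPLES, KNOWN_DBPEDIA_TERMS))
```

Making new statements about existing dbpedia terms passes the check; only coining a new name inside someone else's namespace is flagged.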
But then another batch of questions comes up.
In the desktop publishing age, an individual could publish a
newspaper by themselves. However, it wouldn't be much of a newspaper,
and it wouldn't be notable enough to be in Wikipedia unless it got
involved in some bizarre controversy. I think that the fact "A Newspaper is an
Organization" does much more good than harm, but reasoning systems that
use such facts need some ability to deal with uncertainty. (Default logic?)
The "Person and Place are disjoint" assertion is similar, but
worse. In particular, there are about 10 counterexamples. Considering
how many Persons and Places there are, and the nature of Wikipedia,
that's excellent data quality. I've worked on line-of-business systems
that are a lot worse. What does one do with such an assertion? Reject
the whole data set? Reject the offending triples? Paint the offending
triples red?
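
The three options can be sketched as explicit policies applied to a list of (subject, class) typings. This is only an illustration under assumptions: the sample data is made up (these are not the actual counterexamples), and the `Policy` names are my own labels for the three choices above.

```python
from enum import Enum


class Policy(Enum):
    REJECT_DATASET = "reject the whole data set"
    DROP_TRIPLES = "reject the offending triples"
    FLAG_TRIPLES = "paint the offending triples red"


def find_violations(typings, class_a, class_b):
    """Return the subjects typed as both class_a and class_b."""
    in_a = {s for (s, c) in typings if c == class_a}
    in_b = {s for (s, c) in typings if c == class_b}
    return in_a & in_b


def apply_policy(typings, class_a, class_b, policy):
    """Return (kept, flagged) lists of (subject, class) pairs."""
    offenders = find_violations(typings, class_a, class_b)
    flagged = [(s, c) for (s, c) in typings
               if s in offenders and c in (class_a, class_b)]
    if policy is Policy.REJECT_DATASET:
        if offenders:
            raise ValueError(f"{len(offenders)} subject(s) violate disjointness")
        return list(typings), []
    if policy is Policy.DROP_TRIPLES:
        return [t for t in typings if t not in flagged], flagged
    # FLAG_TRIPLES: keep everything, but report what looks suspicious.
    return list(typings), flagged


# Made-up sample data; not the actual counterexamples.
TYPINGS = [
    ("ex:Alice", "Person"),
    ("ex:Springfield", "Place"),
    ("ex:Ambiguous", "Person"),
    ("ex:Ambiguous", "Place"),
]

kept, flagged = apply_policy(TYPINGS, "Person", "Place", Policy.DROP_TRIPLES)
print(len(kept), len(flagged))
```

Treating the disjointness statement as a data-quality check rather than as a hard logical axiom is a design choice of this sketch; a strict OWL reasoner would instead report the whole graph as inconsistent.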
I feel comfortable making assertions in SKOS and OWL when I let
concepts like 'disjoint', 'same as', and 'subclass' have the kind of
fuzzy meaning that these terms have for people. I don't feel so
comfortable making them with the semantics given by the standards and
actual implementations of the RDF/RDFS/OWL stack.
The best I can think of is to split the assertions that I make
into several groups, putting the "safer" ones together and the more
"dangerous" ones together. That at least gives people some control
over which ones they're going to use.
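
A minimal sketch of that grouping, assuming each group would be published separately and a consumer loads only the ones they trust. The group names, the "ex:" terms, and the placement of each assertion in a group are my own illustrative guesses, not a finished classification.

```python
# Hypothetical grouping; which assertion belongs in which group is a judgment call.
GROUPS = {
    "safer": {
        ("ex:Newspaper", "rdfs:subClassOf", "ex:Organization"),
        ("ex:Newspaper", "rdfs:subClassOf", "ex:Work"),
    },
    "dangerous": {
        ("ex:Person", "owl:disjointWith", "ex:Place"),
    },
}


def select_assertions(groups, wanted):
    """Return the union of the named groups; an unknown name raises KeyError."""
    selected = set()
    for name in wanted:
        selected |= groups[name]
    return selected


cautious = select_assertions(GROUPS, ["safer"])
everything = select_assertions(GROUPS, ["safer", "dangerous"])
print(len(cautious), len(everything))
# 2 3
```

A cautious consumer never sees the disjointness statement, so the handful of counterexamples cannot make their data inconsistent.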
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion