Kingsley Idehen wrote:
>
> One approach, using SIOC ontology:
> Try C as a property of an Entity of Type: sioc:Container that is 
> associated with another Entity of Type: sioc:Item that has properties 
> for x (Place) and y (Person). De-referencing the sioc:Container URI 
> will expose data for C, and via the "sioc:container_of" property  you 
> will get to data for x and/or y.
>
    I've been thinking a lot about this over the weekend.

    I think the really valuable thing I could provide isn't the matrix,  
but a set of facts that are derived from the matrix but that are 
filtered by human effort.

    The "A Newspaper is an Organization" fact is a good example.  It can 
be easily stated in RDFS,  and live alongside the assertion that "A 
Newspaper is a Creative Work".  In a detailed model,  we could 
distinguish between

(1) The New York Times (as a bibliographic entry)
(2) The New York Times Corporation (which also owns the Boston Globe, 
WQXR-FM, 15 other newspapers and 15% of the Boston Red Sox)
(3) The Business Unit inside the NYTC which produces the New York Times.

    If we follow the principle that "dbpedia is about wikipedia",  then 
I think we can say that

http://en.wikipedia.org/wiki/The_New_York_Times

    is about (1) and (3),  and that

http://en.wikipedia.org/wiki/The_New_York_Times_Company

    is about (2).   So overall,  I think it's right to say that

http://en.wikipedia.org/wiki/The_New_York_Times

    is both a Work and an Organization.  The Work and the Organization 
are certainly conflated in the "commonsense" model that most people have.
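
    A minimal sketch of what the two RDFS statements would entail, using plain Python tuples rather than a real RDF library.  The dbpedia class and resource URIs below are assumptions for illustration;  the only inference applied is the standard RDFS subclass rule (an instance of a subclass is also an instance of its superclasses).

```python
# Triples are (subject, predicate, object) tuples of URI strings.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Assumed namespaces and class names, used here only for illustration.
DBO = "http://dbpedia.org/ontology/"
DBR = "http://dbpedia.org/resource/"

triples = {
    # "A Newspaper is an Organization" and "A Newspaper is a Creative Work"
    (DBO + "Newspaper", RDFS_SUBCLASS_OF, DBO + "Organisation"),
    (DBO + "Newspaper", RDFS_SUBCLASS_OF, DBO + "Work"),
    (DBR + "The_New_York_Times", RDF_TYPE, DBO + "Newspaper"),
}


def infer_types(triples):
    """Add the rdf:type triples implied by rdfs:subClassOf, to a fixed point."""
    closed = set(triples)
    subclass_pairs = {(s, o) for s, p, o in closed if p == RDFS_SUBCLASS_OF}
    changed = True
    while changed:
        changed = False
        for s, p, o in list(closed):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                new_triple = (s, RDF_TYPE, sup)
                if o == sub and new_triple not in closed:
                    closed.add(new_triple)
                    changed = True
    return closed


def types_of(entity, triples):
    """Return the set of classes an entity is typed with."""
    return {o for s, p, o in triples if s == entity and p == RDF_TYPE}


nyt_types = types_of(DBR + "The_New_York_Times", infer_types(triples))
```

    With both subclass statements loaded,  the one resource ends up typed as a Newspaper,  an Organisation and a Work at once,  which is the conflation described above.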

    Thinking about it over the weekend,  I've realized that I'm free to 
assert any triples I want.  I can publish my own additions to the 
dbpedia ontology,  and people are free to use them or not use them.  The 
one thing that I shouldn't do is mint new URIs under dbpedia's 
namespace:  if I want to add new types or predicates to the dbpedia 
ontology,  I need to define them in my own namespace.

    But then another batch of questions comes up.

     In the desktop publishing age,  an individual could publish a 
newspaper by themselves.  However,  it wouldn't be much of a newspaper,  
and it wouldn't be notable enough to be in Wikipedia unless it got 
involved in some bizarre controversy.  I think that the fact "A Newspaper is an 
Organization" does much more good than harm,  but reasoning systems that 
use such facts need some ability to deal with uncertainty.  (Default logic?)

    The "Person and Place are disjoint" assertion is similar,  but 
worse:  the data contains about 10 counterexamples.  Considering 
how many Persons and Places there are,  and the nature of Wikipedia,  
that's excellent data quality.  I've worked on line-of-business systems 
that are a lot worse.  What does one do with such an assertion?   Reject 
the whole data set?  Reject the offending triples?  Paint the offending 
triples red?
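
    The three options can be sketched as policies over a small type table.  This is only an illustration:  the entity names are made up,  the policy names are mine,  and a real reasoner would treat a disjointness violation as an inconsistency rather than offer a choice.

```python
def find_violations(types, class_a, class_b):
    """Return the entities typed with both of two supposedly disjoint classes."""
    return sorted(e for e, ts in types.items() if class_a in ts and class_b in ts)


def apply_policy(types, class_a, class_b, policy):
    """Handle disjointness violations; returns (resulting types, offenders)."""
    offenders = find_violations(types, class_a, class_b)
    if policy == "reject_dataset":
        # Strictest option: any counterexample invalidates the whole data set.
        if offenders:
            raise ValueError("disjointness violated by: " + ", ".join(offenders))
        return {e: set(ts) for e, ts in types.items()}, offenders
    if policy == "reject_triples":
        # Drop only the conflicting type assertions on the offending entities.
        cleaned = {
            e: (set(ts) - {class_a, class_b}) if e in offenders else set(ts)
            for e, ts in types.items()
        }
        return cleaned, offenders
    if policy == "flag":
        # "Paint them red": keep everything, report the offenders separately.
        return {e: set(ts) for e, ts in types.items()}, offenders
    raise ValueError("unknown policy: " + policy)


# Hypothetical sample data: one entity is typed as both Person and Place.
sample = {
    "ex:Alice": {"Person"},
    "ex:Springfield": {"Place"},
    "ex:Ambiguous_Entry": {"Person", "Place"},
}

flagged_types, flagged = apply_policy(sample, "Person", "Place", "flag")
cleaned_types, dropped = apply_policy(sample, "Person", "Place", "reject_triples")
```

    With only a handful of offenders among a very large number of entities,  the "flag" and "reject_triples" policies lose almost nothing,  while "reject_dataset" throws away everything.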

    I feel comfortable making assertions in SKOS and OWL when I let 
concepts like 'disjoint',  'same as',  and 'subclass' have the kind of 
fuzzy meaning that these terms have for people.  I don't feel so 
comfortable making them with the semantics given by the standards and 
actual implementations of the RDF/RDFS/OWL stack.

    The best I can see doing is to split up the assertions that I make 
into several groups,  putting the "safer" ones together and the more 
"dangerous" ones together.  That at least gives people some control 
over what they're going to use.
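
    A rough sketch of that grouping,  assuming each group is published as its own named set of triples.  The group names and the assignment of each assertion to a group are illustrative guesses,  not a settled classification,  and the prefixed names stand in for full URIs.

```python
# Each group is a list of (subject, predicate, object) assertions.
# Which assertion belongs in which group is an assumption for illustration.
ASSERTION_GROUPS = {
    "safe": [
        ("dbo:Newspaper", "rdfs:subClassOf", "dbo:Work"),
    ],
    "dangerous": [
        ("dbo:Newspaper", "rdfs:subClassOf", "dbo:Organisation"),
        ("dbo:Person", "owl:disjointWith", "dbo:Place"),
    ],
}


def select_assertions(groups, accepted):
    """Return the assertions from only the groups a consumer chooses to trust."""
    selected = []
    for name in accepted:
        if name not in groups:
            raise KeyError("no such assertion group: " + name)
        selected.extend(groups[name])
    return selected


# A cautious consumer loads only the safe group; a bolder one loads both.
cautious = select_assertions(ASSERTION_GROUPS, ["safe"])
bold = select_assertions(ASSERTION_GROUPS, ["safe", "dangerous"])
```

    In practice each group could simply be a separate file or named graph,  so that choosing a group is just a matter of which one gets loaded.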
 





_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
