On Jun 4, 2013, at 5:31 AM, Jan Michelfeit wrote:

> Hi,
> 
>> NULL most often simply represents that the value is not known, in my 
>> experience
> 
> So another conclusion of this discussion can be that unknown is the most 
> sensible default interpretation if the triple is not there and there is no 
> indication of the other cases.
> 
>> I think that you have to ask exactly what is meant and then model it. ... 
>> the purpose of the whole exercise is to construct some RDF that is easy to 
>> query
> 
> My original motivation was actually not modelling such a situation, but rather 
> interpreting data from unknown sources.
> 
> An example: Let's have a paper ex:paper. Source A claims "ex:paper 
> ex:reviewedBy ex:Hugh". Source B doesn't have any triple "ex:paper 
> ex:reviewedBy *".
> Now I want to integrate the two sources. Shall the result be "the paper was 
> reviewed by Hugh" or "we are not certain whether the paper's been reviewed 
> because source B says it has not been"?

Well, certainly not the second, as B does not say it has not been reviewed. B 
simply makes no assertion about the reviewing. In an open world, it is not 
correct to infer "source says not X" from "source does not say X". If you want 
to be able to say that a paper is not reviewed, then (since RDF does not 
provide you with a built-in "not") you have to find or invent a way to say that 
explicitly. You might for example use OWL-style reasoning and have a class of 
papers reviewed by Hugh (it would be the owl:hasValue ex:Hugh restriction on 
ex:reviewedBy) and then B can say that this paper is not in that class, by 
asserting that it's in the complement class. (ex:paper rdf:type 
(owl:complementOf (owl:hasValue ex:reviewedBy ex:Hugh))). But admittedly this 
might be overkill for your example application. 
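
To make the distinction between "does not say X" and "says not X" concrete, here is a minimal sketch in Python, with triples modelled as plain tuples. The property name ex:notReviewedBy and the third source are assumptions invented for illustration; they stand in for whatever explicit way of saying "not" you find or invent, and are not part of RDF or OWL.

```python
# Triples are modelled as plain (subject, predicate, object) tuples.
# "ex:notReviewedBy" is a hypothetical property invented for this sketch;
# it stands in for any explicit, agreed-upon way of stating a negation.

def review_status(graph, paper, reviewer):
    """Classify what one source says about a paper/reviewer pair."""
    if (paper, "ex:reviewedBy", reviewer) in graph:
        return "asserted"
    if (paper, "ex:notReviewedBy", reviewer) in graph:
        return "denied"
    # Open world: a missing triple is silence, not a denial.
    return "unknown"

source_a = {("ex:paper", "ex:reviewedBy", "ex:Hugh")}
source_b = set()  # B has no ex:reviewedBy triple at all
source_c = {("ex:paper", "ex:notReviewedBy", "ex:Hugh")}  # hypothetical explicit denial

for name, graph in [("A", source_a), ("B", source_b), ("C", source_c)]:
    print(name, review_status(graph, "ex:paper", "ex:Hugh"))
```

Only the hypothetical source C, which makes an explicit statement, can be read as denying the review; B is merely silent.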

As to how best to combine data from multiple sources even when they might 
disagree, this is a problem for everyone. But your example would seem to be 
straightforward in the RDF view of things, as your A and B don't actually 
disagree: A just provides some information that B is lacking. One expects such 
things to happen in an open world. 
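
As a rough sketch of why A and B do not conflict, the integration can be pictured as a plain set union of triples. The Python below is illustrative only: the function names merge and reviewers are made up for this example, and it assumes the graphs contain no blank nodes (a real RDF merge would also have to keep blank nodes from different sources apart).

```python
# Triples are modelled as plain (subject, predicate, object) tuples.
# Assumption: no blank nodes, so merging is just set union.

def merge(*graphs):
    """Combine graphs by taking the union of their triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged

def reviewers(graph, paper):
    """List every reviewer the graph asserts for the given paper."""
    return sorted(o for s, p, o in graph
                  if s == paper and p == "ex:reviewedBy")

source_a = {("ex:paper", "ex:reviewedBy", "ex:Hugh")}
source_b = set()  # B makes no assertion about reviewing

merged = merge(source_a, source_b)
print(reviewers(merged, "ex:paper"))
```

Because B contributes no triples about reviewing, the union contains exactly what A asserted, and nothing in it contradicts anything else.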

>> it may be more that the subject of the row is having the property withheld 
>> than the value is a nonVisibleValue.
>> you may well find that there is another field in the DB that actually has 
>> the information already
> 
> Answers in this list have been helpful. The conclusion for me is:
> (1) Don't look just at triples alone, but traverse blank nodes [1], they may 
> bear important information.

Um...blank nodes are parts of triples. I'm not sure what you are intending to 
say here. 

> (2) Dependencies between properties should be considered.
> (3) Conflict resolution should also consider sets of values. In the example 
> above, I would conclude "paper was reviewed"; if ex:reviewedBy was modelled 
> with an RDF collection of reviewers, one empty and one non-empty, I would 
> conclude "we don't know".

Why would you come to that conclusion? 

>> I would always avoid bnodes if it is possible/sensible to do - generating a 
>> URI is not hard, and can be useful in the long run.
> 
> BNodes would actually be useful in my particular use case. They are the only 
> thing which can distinguish an "entity" from a "structured attribute" if we 
> don't know anything about the source.

Again, I don't know what you are talking about, but whatever it is, blank nodes 
don't sound like they have anything to do with it. They don't distinguish one 
kind of thing from another, for sure. 

Pat

> 
> Regards,
> Jan
> 
> [1] http://www.w3.org/Submission/CBD/
> 
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes





