Re: [Dbpedia-discussion] Concept Identifiers

Markus Kroetzsch Wed, 01 Jun 2016 12:57:03 -0700

Hi Sebastian,

I'll try to clarify further. This really is a tricky topic and maybe 
more than an email thread is needed to explain this. If you want to dive 
into the details, you may want to check out some textbooks to get 
started (Abiteboul et al. would be the standard intro to database theory 
and Relational Algebra; for FOL there are many choices, but there is no 
DL-specific textbook; there are some good DL tutorials, however, that 
may be useful). I don't know of a good reference that explains the 
differences that are causing confusion here.

On 01.06.2016 17:03, Sebastian Hellmann wrote:
> Hi Markus,
>
> On 01.06.2016 14:49, Markus Kroetzsch wrote:
>> Hi Sebastian,
>>
>> On 01.06.2016 13:07, Sebastian Hellmann wrote:
>>> Hi Markus,
>>>
>>> On 01.06.2016 12:58, Markus Kroetzsch wrote:
>>>> On 01.06.2016 10:46, Sebastian Hellmann wrote:
>>>>> https://en.wikipedia.org/wiki/Unique_name_assumption
>>>>
>>>> The UNA is a principle in formal logic and knowledge representation.
>>>> It is not really related to this discussion. For example, standard
>>>> DBMS all make the UNA, but you can still have many identifiers (keys)
>>>> for the same object in a database.
>>>
>>> Then the database does not use UNA. The above sentence reads like you
>>> could have two primary keys, but then still have them pointing to the
>>> same row.
>>> UNA means, if you have two identifiers A, B you add a triple A
>>> owl:differentFrom B at all times.
>>
>> I don't think that this mixing of different notions is making much sense.
>
> Makes totally sense to me, since they are all quite similar. Entity
> Relationship Diagrams are similar to Ontologies/RDF, SPARQL is often
> implemented using Relational Databases.
> The relational model https://en.wikipedia.org/wiki/Relational_model by
> Codd is consistent with first-order predicate logic as are many
> description logics, in particular a less expressive fragment was used to
> design OWL https://en.wikipedia.org/wiki/Description_logic#First_order_logic

Sorry, but you are mixing up things again here. Being "similar" is not 
enough to establish a logical relationship between two formalisms. Even 
the underlying logic (FOL here) is just one aspect. OWL semantics is 
based on *entailment* of logical consequences in FOL. In contrast, 
Relational Algebra is based on *model checking* with respect to finite 
FOL models. The two tasks are totally and fundamentally different (model 
checking is PSPACE-complete, entailment checking is undecidable, for a 
start). It's beyond this thread to explain all details relevant here, 
and the somewhat vague notion of "UNA" does not really do it justice 
either (UNA is really a property of a logic's model theory, but does not 
tell you whether you are doing model checking or entailment).
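
To make the difference between the two tasks concrete, here is a minimal 
Python sketch. Everything in it (the predicate "knows", the constants a, b, 
c, the function names) is made up for illustration. The first function 
evaluates a sentence in the one finite structure given by the data (model 
checking). The second searches for a model of the same facts in which the 
sentence fails, which is enough to refute entailment; it is only a bounded 
countermodel search, not a decision procedure for FOL entailment.

```python
from itertools import product

# Hypothetical facts: knows(a, c), knows(b, c)
FACTS = {("a", "c"), ("b", "c")}
CONSTANTS = ["a", "b", "c"]


def sentence_holds(domain, knows):
    """Sentence: exists x, y, z. knows(x, z) and knows(y, z) and x != y."""
    return any(
        (x, z) in knows and (y, z) in knows and x != y
        for x, y, z in product(domain, repeat=3)
    )


def model_check():
    """Database reading: the facts are the single finite model,
    and every constant denotes itself."""
    return sentence_holds(CONSTANTS, FACTS)


def find_countermodel(max_size=3):
    """Entailment reading: try interpretations of the constants over small
    domains and return the first one where the facts hold but the sentence
    does not. Returns None if no countermodel is found up to max_size."""
    for size in range(1, max_size + 1):
        domain = range(size)
        for values in product(domain, repeat=len(CONSTANTS)):
            denote = dict(zip(CONSTANTS, values))
            knows = {(denote[s], denote[o]) for s, o in FACTS}
            if not sentence_holds(domain, knows):
                return denote
    return None


print(model_check())        # sentence is true in the given structure
print(find_countermodel())  # an interpretation that maps a and b together
```

In the given structure the sentence holds, but it is not entailed by the 
facts, because a and b may denote the same element in some model.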

>
>> Every SPARQL processor under simple semantics makes the UNA
>
> What is simple SEMANTiCS?

"Simple semantics" is the most basic way of interpreting RDF graphs. If 
you would like to know more, then you could start with the spec:

https://www.w3.org/TR/rdf11-mt/#simple-interpretations

Most SPARQL processors do not go beyond this, though their semantics is 
specified differently (based on model checking rather than on 
entailment, which makes it more natural to talk about, e.g., negation 
and aggregates). Nevertheless, the simple semantics is kind of built 
into the SPARQL BGP semantics already, so you cannot do anything less if 
you implement SPARQL.
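
As a rough illustration of what plain BGP matching does, here is a small 
Python sketch of a pattern matcher over a set of triples. It is not how any 
particular SPARQL engine is implemented, and the ex: names are hypothetical. 
Each IRI is treated as its own term, and owl:sameAs is just another 
predicate with no special meaning.

```python
def match_bgp(patterns, triples, binding=None):
    """Yield variable bindings that match every triple pattern.
    Terms starting with '?' are variables; all other terms must match
    a triple position exactly."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        matched = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check consistency with an earlier binding.
                if candidate.setdefault(term, value) != value:
                    matched = False
                    break
            elif term != value:
                matched = False
                break
        if matched:
            yield from match_bgp(rest, triples, candidate)


# Hypothetical data: two identifiers linked by owl:sameAs.
TRIPLES = {
    ("ex:A", "ex:population", "ex:v1"),
    ("ex:B", "owl:sameAs", "ex:A"),
}

print(list(match_bgp([("ex:A", "ex:population", "?n")], TRIPLES)))
print(list(match_bgp([("ex:B", "ex:population", "?n")], TRIPLES)))
```

The second query returns nothing: without an entailment regime, the sameAs 
triple does not carry the population over to ex:B.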

> Primary key in SPARQL stores backed with
> relational db's often have the Quad {?g {?s ?p ?o}} as the primary key.
> De facto, UNA produces contradictions as soon as you want to state that
> two things are the same. So owl:sameAs would not make sense combined with
> UNA as it would always cause contradictions, except in the reflexive case.
> Just because you are not unifying merging identifiers right away does
> not imply UNA.

I cannot make sense of these sentences. UNA is a property of the 
semantics you use, which in turn is determined by the tool (reasoner) 
you apply. You cannot "imply UNA" -- either you implement it or you 
implement something else. How you implement equality reasoning (by 
"merging identifiers", for example) is entirely unrelated. You can 
perfectly well capture equality reasoning in a UNA system using 
auxiliary axioms. None of this has anything to do with how you identify 
quads in SPARQL.
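
Capturing equality with auxiliary axioms can be sketched as follows, 
assuming a made-up predicate ex:eq that is treated as an ordinary relation. 
The rules applied are symmetry, transitivity, and replacement of equal terms 
in subject and object position, run to a fixpoint. This is only an 
illustration in Python, not the axiomatization used by any specific tool.

```python
EQ = "ex:eq"  # hypothetical, ordinary predicate standing in for equality


def saturate(triples):
    """Apply the equality axioms by forward chaining until nothing new
    can be derived. Terminates because no new terms are ever created."""
    facts = set(triples)
    while True:
        derived = set()
        eq_pairs = {(s, o) for s, p, o in facts if p == EQ}
        for a, b in eq_pairs:
            derived.add((b, EQ, a))              # symmetry
            for c, d in eq_pairs:
                if b == c:
                    derived.add((a, EQ, d))      # transitivity
            for s, p, o in facts:
                if s == a:
                    derived.add((b, p, o))       # replace equal subject
                if o == a:
                    derived.add((s, p, b))       # replace equal object
        if derived <= facts:
            return facts
        facts |= derived


DATA = {
    ("ex:B", EQ, "ex:A"),
    ("ex:A", "ex:population", "ex:v1"),
}

closed = saturate(DATA)
print(("ex:B", "ex:population", "ex:v1") in closed)
```

After saturation, a query for the population of ex:B finds an answer by 
plain matching, even though ex:A and ex:B remain distinct terms.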

>
>> , while RDF and OWL entailment regimes for SPARQL do not make it. This
>> has nothing to do with how you model concepts and their IDs in your
>> domain. You can have the same data and use it in different SPARQL
>> tools, sometimes with a UNA sometimes without,
> there are SPARQL tools that throw a contradiction, if they encounter
> owl:sameAs
>
>> but your choice of modelling identifiers is not affected by that.
>
> OWL was designed to handle multiple identifiers. This affects the
> modeling in a way that it is fine to have several IDs.
> DBpedia as such uses this. Below are all ID's for DBpedia Berlin., where
> the first one is the canonical one. A good idea might be to provide
> <http://dbpedia.org/pagid/3354> as well in the future. We are working on
> a service that allows to canonicalize all DBpedia Ids, which is only
> legit as there is no UNA intended in OWL.

Thanks for reminding us of the various URIs you have in DBpedia (keeping 
some connection to the topic of this thread ;-). The relationship with 
UNA is again not so relevant here. It is not true to say that you can 
only have several identifiers because OWL does not have a UNA. Instead, 
it is correct to say that asserting several identifiers to be 
semantically equal (in the sense of sameAs) is only useful if you have 
no UNA. But this statement is really trivial: a logic with UNA never has 
a built-in equality (this would be a design error). However, a logic may 
use UNA and axiomatize an equality predicate to achieve the same results 
in query answering. In some logics, you cannot even detect at all 
whether the UNA has been made or not when using positive queries (OWL QL 
is a typical example).

My main point is that none of these intricate discussions of ontology 
semantics based on mathematical logic have anything to do with the 
choice of a user to have more than one identifier for a concept. You 
will encode the fact that something is an identifier in different ways 
depending on what ontology language you use, but the discussion is 
really on another level. Many data collections we are talking about have 
no logical semantics at all, yet they may use multiple identifiers for 
one thing. I am sure that Tom's example of multiple identifiers in 
Freebase is a purely technical approach based on redirects and API 
"synonyms" without any commitment to a specific logic.
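
For comparison, a purely technical redirect scheme needs no logic at all. 
The Python sketch below uses a hypothetical redirect table and a 
hypothetical function name; it does not describe the actual mechanism of 
Freebase or DBpedia.

```python
# Hypothetical redirect table: alias identifier -> target identifier.
REDIRECTS = {
    "ex:Berlin_old": "ex:Berlin_alias",
    "ex:Berlin_alias": "ex:Berlin",
}


def canonical(identifier, redirects=REDIRECTS):
    """Follow redirects until an identifier without a redirect is reached.
    The 'seen' set stops the loop if the table contains a cycle."""
    seen = set()
    while identifier in redirects and identifier not in seen:
        seen.add(identifier)
        identifier = redirects[identifier]
    return identifier


print(canonical("ex:Berlin_old"))
print(canonical("ex:Unrelated"))
```

Nothing here depends on UNA, entailment, or any ontology language; it is 
just a lookup.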

Cheers,

Markus

-- 
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/
