Re: [Wikidata-l] OWL based ontologies as basis for Wikidata item interactions and property proposal

2015-04-06 Thread Markus Krötzsch

Hi Benjamin,

Thanks for clarifying. I see your problem and I agree with your
approach. In fact, I think WebProtege is a big step forward in terms of
collaborative ontology editing. One could certainly improve this much
further, but there are many good ideas there. I am not sure that it
would have the right workflows for a group of more than, say, a dozen
people working on one part of an ontology, but maybe this is not needed
for you now. It's also great as a kind of etherpad/Google Docs for
ontologies.


Protege is not just OWL, so one could imagine WebProtege supporting
Wikidata-specific modelling. Maybe this would even be interesting as a
project for the creators of the software. The question is what we would
need there in terms of structure.


Regarding interaction with LOD sources, one has more options. In 
particular, one can export simplified statements (possibly filtered 
based on some of the statements' features) using simple RDF triples 
(giving up qualifiers). One can also work with seeAlso links to point to 
external resources without making a detailed statement about the 
relationship of this resource to an item. I think we can do a lot there 
without running into the difficulties you get when accurately 
representing statements in RDF and applying OWL to them.
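
A minimal sketch of the simplified export described above, in Python with plain tuples. The statement layout, the field names (`rank`, `qualifiers`) and all `ex:` identifiers are assumptions made up for illustration; this is not the actual Wikidata export format.

```python
# Hypothetical statements; field names and ex: identifiers are illustrative only.
statements = [
    {"subject": "ex:GeneA", "property": "ex:associatedWith", "value": "ex:DiseaseB",
     "rank": "normal", "qualifiers": {"ex:determinedBy": "ex:StudyC"}},
    {"subject": "ex:GeneA", "property": "ex:associatedWith", "value": "ex:DiseaseD",
     "rank": "deprecated", "qualifiers": {}},
]


def simplify(statements, keep=lambda s: True):
    """Return plain (subject, property, value) triples, giving up qualifiers."""
    return [
        (s["subject"], s["property"], s["value"])
        for s in statements
        if keep(s)
    ]


def not_deprecated(statement):
    """Example filter on a statement feature (here: its rank)."""
    return statement["rank"] != "deprecated"


triples = simplify(statements, keep=not_deprecated)
for triple in triples:
    print(triple)
```

The filter is passed in as a function so that other statement features could be used to decide what is exported; the qualifiers are simply not copied into the output.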


Cheers,

Markus


Re: [Wikidata-l] OWL based ontologies as basis for Wikidata item interactions and property proposal

2015-04-06 Thread Benjamin Good
Hi Markus,

Thanks for your responses.  Markus, I think the point that Sebastian was
raising has more to do with practices for communities working on data
modeling for Wikidata than with OWL semantics specifically.  Let me
explain a little further.  We are a group of 3-7 (depending on the week)
people working collaboratively on the task of loading Wikidata with content
linking genes, diseases, and drugs.  Even amongst this small group, we have
struggled to keep our data modeling discussions orderly and productive -
even before entering into these discussions with the broader community.
It's a constant struggle to see the big picture.  One of the main reasons
for this (IMHO) is the lack of ways to view the structure of the model that
we are assembling as it's being figured out.  This is a consequence of
Wikidata's schema-free design.  On Freebase, for example, this problem was
addressed using their Type system.  For a given kind of thing, you could
create/find a Type to describe it and there you could argue about what set
of properties was most useful for representing things of that Type.
Wikidata seems to want to deal with things one property at a time - which
is fine until you want to come up with a coherent collection of related
properties and associated constraints that cover a particular domain.  For
that purpose an ontology and tools for looking at and thinking about the
ontology become very useful.

So, currently we are experimenting with WebProtege as a place to
collaboratively work through our data models before entering into
discussions on Wikidata itself.  Thoughts on that as a pattern for
collaboration would be helpful - could/should we be doing this all in
Wikidata?  Would some interface improvements be possible that facilitate
schema-level views and discussions?

The idea of working in OWL (though note that we are not currently using any
semantics beyond RDFS) provides the added potential bonuses of
facilitating import/export and mappings to other linked data sources, but
this is really secondary to the social management challenge.

Emw,
We have not explicitly attempted to force alignment with BFO or OBO -
though we have been in touch with Chris Mungall about this and would
welcome help with such alignments either on WebProtege or on wiki.  We are
driven very pragmatically by the requirements of the data sources that are
next on the list for import but, as the ontology discussion should
indicate, want to do our best to help generate a clean and effective model
for the community to build upon.

-Ben




Re: [Wikidata-l] OWL based ontologies as basis for Wikidata item interactions and property proposal

2015-04-06 Thread Markus Krötzsch

On 06.04.2015 22:02, Markus Krötzsch wrote:

> Dear Sebastian,
>
> Using OWL is surely a nice idea when the semantics is appropriate (i.e.,
> where you want Open-World entailment, not constraints) and here the

Possibly misleading typo: I meant "where", not "here" ;-) -- Markus



___
Wikidata-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-l







Re: [Wikidata-l] OWL based ontologies as basis for Wikidata item interactions and property proposal

2015-04-06 Thread Markus Krötzsch

Dear Sebastian,

Using OWL is surely a nice idea when the semantics is appropriate (i.e.,
where you want Open-World entailment, not constraints) and where the
expressiveness is enough. This is much more difficult, however, than one
might at first think it is. For a simple example, the common Wikidata
constraint that a property is /symmetric/ cannot be expressed in OWL.
The reason is that, in order to represent statements with references and
qualifiers in OWL (i.e., in RDF), one needs to introduce auxiliary
individuals for statements. I discussed some of these limitations of OWL
in my keynote at the "OWL: Experiences and Directions" workshop 2012,
but it seems the slides are not on the web site. I will see if I can
track them down and publish them somewhere.


Maybe you have already observed these limitations yourself? I was not
sure from your email (and the linked documents) how exactly you envision
the use of OWL. One thing that is clear is that OWL cannot be used
on Wikidata directly, but only on an RDF version of it. For this reason,
you should also have a look at the (many) ongoing discussions about the
final details of this RDF model. You can find related issue reports on
Phabricator. I think it is also fairly safe to base your work on the
published RDF export (see our paper at ISWC 2014): there will be
changes, but the basic structural aspects that matter for creating OWL
statements will most likely be the same. The paper also contains some
discussion of how current Wikidata constraints can be mapped to OWL
(which of course is not the semantics that constraints have).



Maybe I should explain this again in detail, since some of these issues 
do not seem to be completely clear to the Wikidata community right now. 
For example, you can see things like Wikidata's "instance of" (P31) 
being declared to be "equivalent" (P1628) to rdf:type. Of course, this 
"equivalence" is only an informal notion that refers to the common ideas 
of classification that are embodied in both "properties" (note that they 
are both called "properties" but that there are fundamental differences 
between RDF properties and Wikidata properties -- again, they are 
closely related in spirit but not in a precise formal way). In 
particular, there is no semantic framework where P31 and rdf:type 
coexist, so it does not make sense to declare them "equivalent" in any 
stronger way. The best we can do is to translate Wikidata data into RDF, 
but after this translation, there is no "P31" any more: instead, there 
are several RDF properties that are used together to encode P31 
statements, and none of these RDF properties is "equivalent" to rdf:type.
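
To make the last point concrete, here is a minimal sketch in Python of how one P31 statement might be encoded through an auxiliary statement individual. The `ex:` identifiers and the `-statement`, `-value` and `-qualifier` suffixes are hypothetical names chosen for this illustration; they are not the vocabulary of the published RDF export.

```python
from itertools import count

_statement_ids = count(1)


def encode_statement(item, prop, value, qualifiers=None):
    """Encode one statement as triples around an auxiliary statement node.

    The derived property names are made up for this sketch.
    """
    node = f"ex:statement{next(_statement_ids)}"
    triples = [
        (item, f"{prop}-statement", node),  # item -> auxiliary individual
        (node, f"{prop}-value", value),     # auxiliary individual -> main value
    ]
    for q_prop, q_value in (qualifiers or {}).items():
        triples.append((node, f"{q_prop}-qualifier", q_value))
    return triples


triples = encode_statement(
    "ex:Item1", "ex:P31", "ex:Class1", {"ex:someQualifier": "2015"}
)
predicates = {p for _, p, _ in triples}
direct_links = [t for t in triples if t[0] == "ex:Item1" and t[2] == "ex:Class1"]

for triple in triples:
    print(triple)
```

In this encoding no single triple connects the item to its class, and the set of predicates contains several derived properties but nothing that plays the role of rdf:type.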


OWL is part of the RDF world, and it only has meaning in this context --
you cannot apply OWL to Wikidata contents directly. You can certainly
apply OWL to RDF exported from Wikidata. However, if you want the
resulting conclusions to be "first class" statements in the Wikidata
world (as you seem to suggest), then you need to use an RDF encoding
that faithfully captures all data in Wikidata. This is the reason why
OWL can express the symmetry of RDF properties, but not the symmetry of
Wikidata properties.
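
The difference can be illustrated with a small Python sketch. It simulates only the effect of a symmetric-property axiom (add the reversed triple); it is not an OWL reasoner, and all `ex:` names, including the `-statement` and `-value` helper properties, are assumptions made up for this example.

```python
def apply_symmetry(triples, symmetric_props):
    """Add (o, p, s) for every (s, p, o) whose predicate is declared symmetric."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in symmetric_props:
            inferred.add((o, p, s))
    return inferred


# Case 1: a plain RDF property. Symmetry gives the expected reverse triple.
simple = {("ex:A", "ex:relatedTo", "ex:B")}
closed_simple = apply_symmetry(simple, {"ex:relatedTo"})

# Case 2: the same fact encoded through an auxiliary statement individual.
reified = {
    ("ex:A", "ex:relatedTo-statement", "ex:stmt1"),
    ("ex:stmt1", "ex:relatedTo-value", "ex:B"),
    ("ex:stmt1", "ex:someQualifier", "2015"),
}
closed_reified = apply_symmetry(
    reified, {"ex:relatedTo-statement", "ex:relatedTo-value"}
)

# A reverse statement would need B to point at some statement node.
reverse_statements = [
    t for t in closed_reified
    if t[0] == "ex:B" and t[1] == "ex:relatedTo-statement"
]
print(sorted(closed_simple))
print(reverse_statements)
```

Declaring the helper properties symmetric only flips the individual edges around the existing statement node; it does not produce a statement about ex:B with ex:A as its value, which is what symmetry of the Wikidata property would require.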


Best regards,

Markus


On 03.04.2015 11:16, Sebastian Burgstaller wrote:

Hello all,

Wikidata consists of millions of single data items, which is great. In
order to facilitate modeling the interactions between the single items,
we hereby suggest using OWL-based ontologies
(http://en.wikipedia.org/wiki/Web_Ontology_Language).

We think that using ontologies brings several advantages:
-Looking at an ontology (which could be generated collaboratively, e.g. on
webprotege.stanford.edu) gives a very clear overview of how data is
interconnected. This would allow for modeling of even very large and/or
complex interactions.
-Laying out a data integration project in an ontology first, before
actually integrating data into WD, facilitates property proposals, as an
ontology with its properties could first be designed and then the
ontology with all its properties and classes could be generated as a whole.
-Data could be queried/exported from WD based on an ontology by simply
selecting the whole or parts of an ontology.

This approach has been suggested and discussed by Benjamin Good, Elvira
Mitraka, Andra Wagmeester, Andrew Su and me. As an example, we put
together draft properties for gene disease interactions, which allows
for WD community discussion of this approach. A preliminary version can
be found here:
https://www.wikidata.org/wiki/User:ProteinBoxBot/GeneDiseaseIteraction_Discussion

Best regards,

Sebastian




Re: [Wikidata-l] OWL based ontologies as basis for Wikidata item interactions and property proposal

2015-04-04 Thread Emw
Sebastian, Benjamin, Elvira, Andra, Andrew,

Kudos on your progress with an OWL-centric approach to knowledge
representation.  The community has been incorporating OWL concepts into
property definitions and ontology development on-wiki for some time, but
yours is the first Wikidata group I'm aware of that has incorporated
Protege into the process.

> We think that using ontologies brings several advantages


The examples you cite seem like good ideas and I support them.

I would also suggest considering how the Wikidata ontologies we develop fit
into established ontologies in the Semantic Web.  For example, the OBO
Foundry (http://www.obofoundry.org/) is by far the world's most widely used
group of biomedical ontologies [1, 2].  Those ontologies are rooted in the
Basic Formal Ontology (BFO).  OWL helps a great deal in being interoperable
with those works, but a further ontological commitment tends to be needed
for easy compatibility.

Is your gene-disease interaction ontology compatible with BFO, and the OBO
ontologies rooted in it?

Cheers,
Eric

https://www.wikidata.org/wiki/User:Emw

1.  http://www.nature.com/nbt/journal/v25/n11/full/nbt1346.html
2.  https://scholar.google.com/scholar?cites=13806088078865650870


Re: [Wikidata-l] OWL based ontologies as basis for Wikidata item interactions and property proposal

2015-04-03 Thread Jane Darnell
Interesting approach, and one I would support. I have been against forcing
Wikidata into any other "jacket" than one of its own knitting, but this
approach makes OWL look like any other external database that may or may
not come with properties worth integrating into Wikidata's "jacket".
