Re: vocabularies and data alignment
Hugh Glaser wrote:
> Hi,
> To put it in simple terms for me :-)
> Are you after the algorithms we use to identify when two instances are the same?
> Best
> Hugh

Yes!

François

On 11/06/2009 12:57, François Scharffe francois.schar...@inria.fr wrote:
> Dear LODers,
>
> There have been a couple of discussions on this list already about the need for a vocabulary to represent correspondences between terms of different vocabularies. We have also recently seen various tools (e.g. Silk, ODDlinker) that automatically interlink datasets given a specification of what should be linked. However, there is currently no common way to publish and share this information (i.e., not the links but the way to generate them; see [1] for details).
>
> We are setting up an experiment [1] to see whether it is possible to provide useful services from this data. For that we need your help, so this is a call for contributions: we are collecting any link generator specifications for the LOD graph.
>
> Of course, do not hesitate to comment on the idea or to tell us if you want to be involved. We promise a report on this by the end of summer (northern hemisphere :).
>
> Cheers,
> François
>
> [1] http://melinda.inrialpes.fr

--
François Scharffe, PhD
INRIA, EXMO
http://www.scharffe.fr
Re: vocabularies and data alignment
Hi,

* Francois said, initially: "There has been a couple of discussions already on this list on the need for a vocabulary to represent correspondences between terms of different vocabularies"
* Hugh asked: "Are you after the algorithms we use to identify when two instances are the same?"
* Francois says: "Yes!"

So, terms or instances?

Thanks,
A

--
Aldo Bucchi
U N I V R Z
http://www.univrz.com/
http://aldobucchi.com/
Re: vocabularies and data alignment
Hi,

I think we all want to do the same thing: align vocabularies in order to enhance interoperability in the cloud.

There is the top-down approach: aligning vocabularies first, and using the alignment to provide functionality for the data. For example, if Google releases a dataset using their googlevocab ontology, we could use an alignment between it and FOAF to tell a link generator under which classes similar instances will be found. (Advertisement: providing alignments is the task of the alignment server [1].)

With this experiment we want to go in the other direction, from data to vocabularies. We take this path by looking at link generator specifications and what they need as input, and seeing to what degree this input can be abstracted as a kind of ontology alignment. This could provide functionality such as aggregation, composition and verification of links and alignments (see the detailed description [2]).

Cheers,
François

[1] http://aserv.inrialpes.fr/
[2] http://melinda.inrialpes.fr

Kingsley Idehen wrote:
> Hugh Glaser wrote:
>> Hi, To put it in simple terms for me :-) Are you after the algorithms we use to identify when two instances are the same?
>
> No, he isn't talking ABox. He is talking TBox (data dictionary).
>
> I posted a link about a simple mapper ontology for Google's RDF vocabs that basically prevents the innocent from using those terms and ending up down a swamp (to put things as mildly as possible). See: http://purl.org/NET/googlevocab#
>
> UMBEL is about doing this on bigger and broader scales :-) That's always been the purpose of this project since inception.
>
> Kingsley
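The top-down approach described in the message above (using an existing alignment to tell a link generator where to look) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `gv:` class names, the confidence values and the `target_classes` helper are hypothetical and are not taken from googlevocab, FOAF tooling or the alignment server.

```python
from typing import NamedTuple


class Correspondence(NamedTuple):
    """One cell of a class-level alignment (structure is an assumption)."""
    source: str        # class in the source vocabulary
    target: str        # class in the target vocabulary
    relation: str      # "=" for equivalence, "<" for subsumption
    confidence: float  # value in [0, 1]


# Hypothetical alignment between an invented "gv:" vocabulary and FOAF.
ALIGNMENT = [
    Correspondence("gv:Person", "foaf:Person", "=", 0.95),
    Correspondence("gv:Organization", "foaf:Organization", "=", 0.90),
    Correspondence("gv:Person", "foaf:Agent", "<", 0.60),
]


def target_classes(alignment, source_class, min_confidence=0.8):
    """Classes in the target vocabulary where similar instances should be sought."""
    return [
        c.target
        for c in alignment
        if c.source == source_class and c.confidence >= min_confidence
    ]


# A link generator would restrict its candidate instances to these classes.
print(target_classes(ALIGNMENT, "gv:Person"))
```

Lowering `min_confidence` widens the search to weaker correspondences, at the cost of comparing more candidate instances.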
Re: vocabularies and data alignment
François Scharffe wrote:
> Hugh Glaser wrote:
>> Hi, To put it in simple terms for me :-) Are you after the algorithms we use to identify when two instances are the same?
>> Best
>> Hugh
>
> Yes !
> François

So if the answer is yes, do you mean things in the ABox and the TBox? We must be clear here, as being too generic leads to confusion.

sameAs is not the best way to align things in the TBox.

Kingsley

--
Regards,
Kingsley Idehen
Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO, OpenLink Software
Web: http://www.openlinksw.com
Re: vocabularies and data alignment
François Scharffe wrote:
> Kingsley Idehen wrote:
>> François Scharffe wrote:
>>> Hugh Glaser wrote:
>>>> Hi, To put it in simple terms for me :-) Are you after the algorithms we use to identify when two instances are the same?
>>>> Best
>>>> Hugh
>>> Yes !
>>> François
>> So if the answer is Yes. Then do you mean things in the ABox and TBox? Must be clear here as being too generic leads to confusion.
>
> Link generators work at the instance level (ABox): they generate links between instances. They need some input, a specification of what should be interlinked. We think this specification can be lifted to an alignment between vocabularies (TBoxes). We are not 100% sure this will work, which is why we would like to collect such tools and their linkage specifications.
>
> Let me take an example, interlinking persons: one dataset is described with FOAF, the other with VCard.
>
> ?x foaf:name ?name .
> ?y vc:n [ vc:family-name ?fn ; vc:given-name ?gn ] .
>
> The linkage specification might be something like:
>
> if compare(?name, concat(?gn, " ", ?fn)) > threshold then output(?x owl:sameAs ?y)

Fine, that's an instance data (ABox) oriented equivalence algorithm.

> In fact, this specification says that foaf:name corresponds to concat(vc:given-name, " ", vc:family-name), which is an alignment at the TBox level that can be lifted from the linkage specification.

In the TBox you would state that the properties are either equivalent or that one property is a subproperty of the other. Once that is done, reasoners can navigate the instance data via the TBox mappings. This is basically a major aspect of the UMBEL project. Even in its current form, if you take the alignment rules (expressed in OWL) you have a pretty rich basis for leveraging linkages across many shared ontologies. To extend, you simply find your slot and map to that, which brings us back to the "embrace and extend" principle.

Anyway, your response provides clarity, including the fact that the end product of this effort isn't solely a bag of ABox-oriented owl:sameAs links :-)

As I've stated before, coherent Linked Data magic happens when we exploit the power of TBox-level mapping across disparate ontologies. Deceptively Simple always trumps Simply Simple over the long haul; the latter simply doesn't scale :-)

Kingsley

> I hope I was clear enough this time ;)
>
> Cheers,
> François
>
>> sameAs is not the best way to align things in the TBox.
>> Kingsley
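As a rough illustration of the linkage specification quoted in the message above (compare a FOAF name with the concatenation of the vCard given and family names, and emit owl:sameAs when the similarity exceeds a threshold), here is a minimal Python sketch. It works on toy in-memory data rather than real RDF datasets; the example URIs, the choice of `difflib.SequenceMatcher` as the `compare` function and the 0.9 threshold are all assumptions made for illustration, not how Silk, ODDlinker or any other tool actually works.

```python
from difflib import SequenceMatcher

# Hypothetical toy data standing in for two datasets (URIs are invented).
foaf_people = {  # ?x foaf:name ?name
    "http://example.org/a/alice": "Alice Smith",
    "http://example.org/a/bob": "Bob Jones",
}
vcard_people = {  # ?y vc:n [ vc:given-name ?gn ; vc:family-name ?fn ]
    "http://example.org/b/p1": {"given": "Alice", "family": "Smith"},
    "http://example.org/b/p2": {"given": "Carol", "family": "White"},
}


def compare(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (one possible measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def generate_links(foaf, vcard, threshold=0.9):
    """Emit (x, owl:sameAs, y) for every pair whose names are similar enough."""
    links = []
    for x, name in foaf.items():
        for y, n in vcard.items():
            full_name = f"{n['given']} {n['family']}"  # concat(?gn, " ", ?fn)
            if compare(name, full_name) > threshold:
                links.append((x, "owl:sameAs", y))
    return links


links = generate_links(foaf_people, vcard_people)
for s, p, o in links:
    print(s, p, o)
```

The nested loop compares every pair of instances, which is fine for a sketch but would need blocking or indexing on datasets of realistic size.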
Re: vocabularies and data alignment
François,

On Fri, Jun 12, 2009 at 9:07 AM, François Scharffe francois.schar...@inria.fr wrote:
> Link generators are working at the instance level (ABox), they generate links between instances. They need some input, a specification of what should be interlinked. We think this specification can be lifted to an alignment between vocabularies (TBoxes). Well we are not 100% sure this will work, that's why we would like to get such tools and their linkage specifications.
>
> I can take an example, interlinking persons: one dataset is described with FOAF, the other with VCard.
>
> ?x foaf:name ?name .
> ?y vc:n [ vc:family-name ?fn ; vc:given-name ?gn ] .
>
> the linkage specification might be something like:
>
> if compare(?name, concat(?gn, " ", ?fn)) > threshold then output(?x owl:sameAs ?y)
>
> In fact, this specification says that foaf:name corresponds to concat(vc:given-name, " ", vc:family-name), which is an alignment at the TBox level that can be lifted from the linkage specification.
>
> I hope I was clear enough this time ;)

Yes you did. Hugh got it right but I was a bit lost ;)

My quick take on this issue is usually:

Strategy: Use subPropertyOf, etc. if possible; otherwise resort to SWRL or SPARQL; otherwise use custom code (for example, if the IFPs are embedded in URIs).

Implementation: Stick with inference. If that is not possible, materialize an intermediate graph.

Of course the above is not very useful, as what you're looking for is real-world examples to mine for patterns, generalize, and try to push the knowledge up to the TBox.

Good luck, it sounds interesting ;)

Thanks,
A
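The "use subPropertyOf, and materialize an intermediate graph if inference is not available" strategy from the message above can be sketched in a few lines of Python. This is a minimal illustration over plain tuples, not a real reasoner or triple store; the `ex:` properties and the sample data are hypothetical, and only the subPropertyOf entailments (transitivity and propagation to instance data) are covered.

```python
RDFS_SUBPROP = "rdfs:subPropertyOf"


def materialize_subproperties(triples):
    """Return the input triples plus those entailed by rdfs:subPropertyOf."""
    graph = set(triples)
    changed = True
    while changed:
        changed = False
        # All (sub, super) property pairs currently known.
        sub = {(s, o) for s, p, o in graph if p == RDFS_SUBPROP}
        new = set()
        # Transitivity: a < b and b < d gives a < d.
        for a, b in sub:
            for c, d in sub:
                if b == c:
                    new.add((a, RDFS_SUBPROP, d))
        # Propagation: (s a o) and a < b gives (s b o).
        for s, p, o in graph:
            for a, b in sub:
                if p == a:
                    new.add((s, b, o))
        if not new <= graph:
            graph |= new
            changed = True
    return graph


# Hypothetical TBox mapping plus a little instance data.
data = [
    ("ex:legalName", RDFS_SUBPROP, "ex:fullName"),
    ("ex:fullName", RDFS_SUBPROP, "foaf:name"),
    ("ex:alice", "ex:fullName", "Alice Smith"),
    ("ex:bob", "ex:legalName", "Bob Jones"),
]
materialized = materialize_subproperties(data)

# Instance data can now be queried through foaf:name alone.
names = sorted(o for s, p, o in materialized if p == "foaf:name")
print(names)
```

Once the mapping lives in the TBox like this, the instance data itself needs no owl:sameAs-style rewriting to be reachable through the shared property.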
vocabularies and data alignment
Dear LODers,

There have been a couple of discussions on this list already about the need for a vocabulary to represent correspondences between terms of different vocabularies. We have also recently seen various tools (e.g. Silk, ODDlinker) that automatically interlink datasets given a specification of what should be linked. However, there is currently no common way to publish and share this information (i.e., not the links but the way to generate them; see [1] for details).

We are setting up an experiment [1] to see whether it is possible to provide useful services from this data. For that we need your help, so this is a call for contributions: we are collecting any link generator specifications for the LOD graph.

Of course, do not hesitate to comment on the idea or to tell us if you want to be involved. We promise a report on this by the end of summer (northern hemisphere :).

Cheers,
François

[1] http://melinda.inrialpes.fr
Re: vocabularies and data alignment
Francois,

With regard to the question in your call for contributions [1]: did I miss something? How about UMBEL [2]? It is an effort that was created to address these kinds of issues eons ago.

I remain quite confused about the tendency to open up new patches when existing efforts (of the community variety) are already in place. Recreation from scratch doesn't scale. Remember, Microsoft came to prominence by grokking a basic principle: Embrace and Extend.

Alignment across Vocabularies, Schemas, or Ontologies is what UMBEL is/was about.

Links:
1. http://melinda.inrialpes.fr/
2. http://umbel.org

Kingsley
Re: vocabularies and data alignment
François Scharffe wrote:
> Hi Kingsley,
>
> Thank you for the pointer. It is true that UMBEL provides a way to link vocabularies. There are others (e.g. SKOS, the alignment format). We will look at how what we are trying to do can complement UMBEL and build on it instead of competing with it.
>
> Our experiment is actually focused on link-generator specifications: many exist, and we would like to look at them and see how they could be published.

Yes, the key thing is that this task isn't about instance data only (ABox); you need TBox smarts, and that's basically the kind of substrate UMBEL provides.

Note, UMBEL and SKOS aren't mutually exclusive :-)

Kingsley

> Cheers !
> Francois
Re: vocabularies and data alignment
Hi,

To put it in simple terms for me :-)
Are you after the algorithms we use to identify when two instances are the same?

Best
Hugh
Re: vocabularies and data alignment
Hugh Glaser wrote:
> Hi, To put it in simple terms for me :-) Are you after the algorithms we use to identify when two instances are the same?

No, he isn't talking ABox. He is talking TBox (data dictionary).

I posted a link about a simple mapper ontology for Google's RDF vocabs that basically prevents the innocent from using those terms and ending up down a swamp (to put things as mildly as possible). See: http://purl.org/NET/googlevocab#

UMBEL is about doing this on bigger and broader scales :-) That's always been the purpose of this project since inception.

Kingsley