On 7/4/13 11:49 AM, Olivier Berger wrote:
Hi.

I hope such "design pattern" questions on consuming Linked Open Data are not OT... otherwise, please suggest an appropriate venue for questions ;)

I'm trying to figure out potential patterns for designing an application /consuming/ Linked Data, typically using SPARQL over a local Virtuoso triple store which was fed with harvested Linked Open Data.

I happen to find resources sometimes identified with http, sometimes with https, which otherwise reference the same URL. Other issues may be the use or not of a trailing slash for dir-like URLs.

For instance, I'd like to match as "identical" two doap:Project resources which have the "same" doap:homepage if I can match http://project1/example.com/home/ and https://project1/example.com/home/

It may happen that a document is rendered the same by the publishing service, whichever way it is accessed, so I'd like to consider that referencing it via URIs which contain http:// or https:// is equivalent. Or a service may have chosen to adopt https:// as a canonical URI for instance, but it may happen that users reference it via http somewhere else...

Obviously, direct matching of the same ?h URIRef won't work in basic SPARQL queries like:

    PREFIX doap: <http://usefulinc.com/ns/doap#>
    SELECT * {
      GRAPH <http://myapp.example.com/graphs?source=http://publisher1.example.com/> {
        ?dp doap:homepage ?h .
        ?dp doap:name ?dn
      }
      GRAPH <http://myapp.example.com/graphs?source=https://publisher2.example.com/> {
        ?ap doap:homepage ?h .
        ?ap doap:name ?an
      }
    }

I can think of a sort of regexp matching on the string after '://' but I doubt I'd get good performance ;-)

Is there a way to create indexes over some URIs, or owl:sameAs relations to manage such URI matching in queries? Or am I left to "normalizing" my URLs in the harvested data before storing them in the triple store?

Would you think there's a reasonably standard approach... or one that would work with Virtuoso 6.1.3? ;)

I imagine that this is a kinda FAQ for consuming Linked (Open) Data... but it seems many more people are concerned with publishing than with consuming in public discussions ;-)

Thanks in advance.

P.S.: already posted a similar question on http://answers.semanticweb.com/questions/23584/matching-ressources-with-variying-url-scheme-http-https
This is an example of what I mean by *explicit* entity relationship semantics that RDF uniquely brings to the table re. enhancements to the basic EAV/CR model and Linked Data. At this juncture, you are dealing with basic structured data and (at best) *implicit* rather than *explicit* machine- and human-comprehensible entity relationship semantics.
Situation: You have the relation doap:homepage, but its semantics aren't clear to you or your user agent. Now, let's leverage some basic RDF and Linked Data to look up the semantics of the doap:homepage relation, and we find:
1. http://lod.openlinksw.com/describe/?url=http%3A%2F%2Fusefulinc.com%2Fns%2Fdoap%23homepage&graph=http%3A%2F%2Fschemapedia.com%2Fschemas%2F -- it's an owl:InverseFunctionalProperty (IFP)
2. http://lod.openlinksw.com/describe/?url=http%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23InverseFunctionalProperty&graph=http%3A%2F%2Fschemapedia.com%2Fschemas%2F -- owl:InverseFunctionalProperty description (a little sparse).
Using relationship semantics "reasoning" and "inference", an RDF processor can determine that the subjects of the doap:homepage relation (irrespective of how they are denoted/named) share a common referent whenever they share the same homepage value. I also posted an IFP exploitation example using SPARQL a while back [1].
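To make the IFP idea concrete, here is a minimal Python sketch of the grouping a reasoner would effectively perform: subjects that share a value of an inverse functional property are treated as co-referent. It is only an illustration, not what Virtuoso does internally. The triples and publisher URLs are made up, and `homepage_key` is a hypothetical normalization (ignore http vs. https and a trailing slash) that is an assumption about the data, since IFP matching on its own only fires when the homepage IRIs are identical. The sketch also does a single grouping pass and does not chain groups across subjects with several homepages.

```python
from collections import defaultdict
from urllib.parse import urlsplit

DOAP_HOMEPAGE = "http://usefulinc.com/ns/doap#homepage"


def homepage_key(iri: str) -> str:
    """Hypothetical normalization: drop the scheme and any trailing slash."""
    parts = urlsplit(iri)
    return parts.netloc.lower() + parts.path.rstrip("/")


def ifp_groups(triples, ifp=DOAP_HOMEPAGE, key=lambda iri: iri):
    """Group subjects that share a value of the inverse functional property."""
    by_value = defaultdict(set)
    for s, p, o in triples:
        if p == ifp:
            by_value[key(o)].add(s)
    # Only a value shared by two or more subjects implies co-reference.
    return [sorted(subjects) for subjects in by_value.values() if len(subjects) > 1]


# Made-up example data: two publishers describing the same project.
triples = [
    ("http://publisher1.example.com/project/a", DOAP_HOMEPAGE,
     "http://project1.example.com/home/"),
    ("https://publisher2.example.com/p/1", DOAP_HOMEPAGE,
     "https://project1.example.com/home"),
    ("http://publisher1.example.com/project/b", DOAP_HOMEPAGE,
     "http://other.example.org/"),
]

print(ifp_groups(triples))                    # exact IRI comparison
print(ifp_groups(triples, key=homepage_key))  # with the assumed normalization
```

With exact IRI comparison the two project descriptions are not grouped, because their homepage IRIs differ in scheme and trailing slash; with the assumed normalization they end up in one group.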
Conclusion: just leverage RDF semantics, forget about regexing anything, and you have a first-class demonstration of what RDF actually adds to Linked Data :-)
Links:

[1] http://bit.ly/Y6TIfs -- Using SPARQL to Integrate Disparate Data via InverseFunctionalProperty (IFP) relations.
--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
