On 4/3/14 7:47 AM, Bart Vandewoestyne wrote:
Hello list,

INITIAL REMARK: if this is not the appropriate mailing list for this question, please let me know the best place to ask questions regarding SPARQL queries and their optimization.

I'm a beginner when it comes to writing SPARQL queries. I am trying to speed up a certain query that I got from someone, which has the following form:

    SELECT ?val (COUNT(DISTINCT ?id) as ?vc)
    WHERE {
      ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?val .
      ?id ?property ?property_value .
      ?property_value bif:contains "'foo'" .
      ?id ?property1 ?property_value1 .
      ?property_value1 bif:contains "'bar'" .
    }
    GROUP BY ?val
    ORDER BY DESC(?vc)

First of all, I noticed that I can write it more elegantly as follows:

    SELECT ?val (COUNT(DISTINCT ?id) as ?vc)
    WHERE {
      ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?val ;
          ?property1 ?property_value1 ;
          ?property2 ?property_value2 .
      ?property_value1 bif:contains "'foo'" .
      ?property_value2 bif:contains "'bar'" .
    }
    GROUP BY ?val
    ORDER BY DESC(?vc)

Secondly, looking at

    http://docs.openlinksw.com/virtuoso/queryingftcols.html

my educated guess was that I could replace this query with

    SELECT ?val (COUNT(DISTINCT ?id) as ?vc)
    WHERE {
      ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?val ;
          ?property ?property_value .
      ?property_value bif:contains 'foo or bar' .
    }
    GROUP BY ?val
    ORDER BY DESC(?vc)

and my hope was that this version would run a little faster (don't ask me why... just a wild guess that I would try).

Unfortunately, this last version seems to give different query results.

My two questions:

1) Why is this last query giving different results? Am I misinterpreting something?

2) Is there a way I can rewrite the original query so that it runs faster?

Thanks!
Bart
One small issue here is that you don't indicate the role of named graphs, i.e., do you want this query to be scoped to every named graph or to specific named graphs? As you can imagine, this has an impact on the perceived performance.
You could use the public LOD Cloud instance [1] to demonstrate your quest, and share a SPARQL query results URL for accelerated analysis (on our side).
[1] http://lod.openlinksw.com/sparql -- LOD Cloud Cache Instance (50 Billion+ Triples)
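As an illustration of the two points above (graph scoping and a shareable results URL), here is a minimal sketch in Python. It assumes the endpoint accepts the standard SPARQL Protocol parameters `query` and `default-graph-uri`; the helper name `results_url` and the graph IRI `http://example.org/my-graph` are hypothetical and only serve the example. Leaving out the graph leaves the query unscoped.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Endpoint named in this thread (the LOD Cloud Cache instance).
ENDPOINT = "http://lod.openlinksw.com/sparql"

# The "more elegant" form of the query from the original question.
QUERY = """
SELECT ?val (COUNT(DISTINCT ?id) AS ?vc)
WHERE {
  ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?val ;
      ?property1 ?property_value1 ;
      ?property2 ?property_value2 .
  ?property_value1 bif:contains "'foo'" .
  ?property_value2 bif:contains "'bar'" .
}
GROUP BY ?val
ORDER BY DESC(?vc)
"""


def results_url(query, graph=None, endpoint=ENDPOINT):
    """Build a shareable GET URL for a SPARQL query (hypothetical helper).

    If `graph` is given, it is sent as `default-graph-uri`, which scopes
    the query to that named graph; otherwise the query is left unscoped.
    """
    params = []
    if graph is not None:
        params.append(("default-graph-uri", graph))
    params.append(("query", query.strip()))
    # urlencode percent-encodes the query text so it is safe inside a URL.
    return endpoint + "?" + urlencode(params)


# Unscoped: no default graph is named in the request.
all_graphs_url = results_url(QUERY)

# Scoped: the graph IRI below is a placeholder, not a real dataset.
one_graph_url = results_url(QUERY, graph="http://example.org/my-graph")

print(one_graph_url)
```

Pasting either URL into a reply lets others rerun exactly the same query, and comparing the scoped and unscoped variants shows how much the choice of named graph affects the timing.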
--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users