On 4/3/14 7:47 AM, Bart Vandewoestyne wrote:
Hello list,

INITIAL REMARK: if this is not the appropriate mailing list for this
question, please let me know the best place to ask questions regarding
SPARQL queries and their optimization.

I'm a beginner when it comes to writing SPARQL queries.  I am trying to
speed up a certain query that I got from someone, which has the
following form:

SELECT ?val (COUNT(DISTINCT ?id) as ?vc)
WHERE
{
    ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?val .
    ?id ?property ?property_value.
    ?property_value bif:contains "'foo'".
    ?id ?property1 ?property_value1.
    ?property_value1 bif:contains "'bar'".
}
GROUP BY ?val
ORDER BY DESC(?vc)


First of all, I noticed that I can write it more elegantly as follows:

SELECT ?val (COUNT(DISTINCT ?id) as ?vc)
WHERE
{
    ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?val ;
        ?property1 ?property_value1 ;
        ?property2 ?property_value2 .

    ?property_value1 bif:contains "'foo'" .

    ?property_value2 bif:contains "'bar'" .
}
GROUP BY ?val
ORDER BY DESC(?vc)

Secondly, looking at
http://docs.openlinksw.com/virtuoso/queryingftcols.html  my educated
guess was that I could replace this query with

SELECT ?val (COUNT(DISTINCT ?id) as ?vc)
WHERE
{
    ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?val ;
        ?property ?property_value .
    ?property_value bif:contains 'foo or bar' .
}
GROUP BY ?val
ORDER BY DESC(?vc)

and my hope was that this version would run a little faster (don't ask
me why... just a wild guess that I would try)

Unfortunately, this last version seems to give different query results.

My two questions:

1) Why is this last query giving different results?  Am I
misinterpreting something?

2) Is there a way I can rewrite the original query so that it runs faster?

Thanks!
Bart

One small issue: you don't indicate the role of named graphs. Do you want this query scoped to every named graph or only to specific named graphs? As you can imagine, this has an impact on the perceived performance.
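
To illustrate what scoping to a specific named graph could look like, here is a minimal sketch of the second query above restricted with a FROM clause. The graph IRI <http://example.org/graph/mydata> is a hypothetical placeholder, not something taken from your setup; substitute the graph that actually holds your data. Whether this helps in your case depends on how your data is distributed across graphs.

```sparql
# Sketch only: same pattern as the rewritten query, limited to one named graph.
# <http://example.org/graph/mydata> is an assumed, hypothetical graph IRI.
SELECT ?val (COUNT(DISTINCT ?id) AS ?vc)
FROM <http://example.org/graph/mydata>
WHERE
{
    ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?val ;
        ?property1 ?property_value1 ;
        ?property2 ?property_value2 .

    # Both full-text conditions must still hold for the same ?id.
    ?property_value1 bif:contains "'foo'" .
    ?property_value2 bif:contains "'bar'" .
}
GROUP BY ?val
ORDER BY DESC(?vc)
```

Without a FROM clause (or a GRAPH pattern), the triple patterns are matched against whatever the endpoint treats as its default dataset, which may mean every named graph in the store.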

You could use the public LOD Cloud instance [1] to demonstrate the problem, and share a SPARQL query results URL so that we can analyze it more quickly on our side.

[1] http://lod.openlinksw.com/sparql -- LOD Cloud Cache Instance (50 Billion+ Triples)

--

Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen





_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users
