[Virtuoso-users] Complex SPARQL query - Performance hints?

Pantelis Natsiavas Wed, 28 Sep 2016 05:46:08 -0700

Hi everybody.

I have a rather complex SPARQL query that is executed thousands of times
across 400 parallel threads. The version below is somewhat simplified for
readability (namespaces, properties and variables have been reduced), but
its complexity (unions, number of graphs, etc.) is left untouched. The
query runs against 4 graphs, the biggest of which contains 5561181
triples.


PREFIX graphA: <GraphABaseURI:>
PREFIX dcterms: <http://purl.org/dc/terms/>

ASK
FROM NAMED <GraphBURI>
FROM NAMED <GraphCURI>
FROM NAMED <GraphABaseURI>
FROM NAMED <GraphDBaseURI>
WHERE{
   {
      GRAPH <GraphABaseURI>{
         ?variableA a graphA:ClassA .
         ?variableA graphA:propertyA ?variableB .
         ?variableB dcterms:title ?variableC .
         ?variableA graphA:propertyB ?variableD .
         ?variableL <GraphABaseURI:propertyB> ?variableD .
         ?variableD <propertyBURI> ?variableE
      }
      .
      GRAPH <GraphBURI>{
        ?variableF <propertyCURI>/<propertyDURI> ?variableG .
        ?variableF <propertyEURI> ?variableH
      }
      .
      GRAPH <GraphCURI>{
        ?variableI <http://www.w3.org/2004/02/skos/core#notation> ?variableJ .
        ?variableI <http://www.w3.org/2004/02/skos/core#prefLabel> ?variableK .
        FILTER (isLiteral(?variableK) && REGEX(?variableK, "literalA", "i"))
      }
      .
      FILTER (isLiteral(?variableJ) && ?variableG = ?variableJ) .
      FILTER (?variableE = ?variableH)
   }
   UNION
   {
       GRAPH <GraphABaseURI>{
          ?variableA a graphA:ClassA .
          ?variableA graphA:propertyA ?variableB .
          ?variableB dcterms:title ?variableC .
          ?variableA graphA:propertyB ?variableD .
          ?variableL <propertyBURI> ?variableE .
          ?variableL <propertyFURI> ?variableD .
       }
       .
       GRAPH <GraphDBaseURI>{
          ?variableM <propertyGURI> ?variableN .
          ?variableM <propertyHURI> ?variableO .
          FILTER (isLiteral(?variableO) && REGEX(?variableO, "literalA", "i"))
       }
       .
       FILTER (?variableE = ?variableN) .

   }
   UNION
   {
       GRAPH <GraphABaseURI>{
          ?variableA a graphA:ClassA .
          ?variableA graphA:propertyA ?variableB .
          ?variableB dcterms:title ?variableC .
          ?variableA graphA:propertyB ?variableD .
          ?variableL <propertyBURI> ?variableE .
          ?variableL <propertyIURI> ?variableD .
       }
       .
       GRAPH <GraphDBaseURI>{
          ?variableM <propertyGURI> ?variableN .
          ?variableM <propertyHURI> ?variableO .
          FILTER (isLiteral(?variableO) && REGEX(?variableO, "literalA", "i"))
       }
       .
       FILTER (?variableE = ?variableN) .
   }
   . FILTER (isLiteral(?variableC) && REGEX(?variableC, "literalB", "i")) .

}


I do not expect anyone to rewrite the above query, of course. I am only
posting it to show its complexity and all the SPARQL structures it uses.

My questions:

1. Would I gain performance if I had all my triples in one graph? That
would let me avoid the unions and simplify the query, but would it also
make the query faster?
2. Are there any indexes that I could build that would help with the above
query? I am not really confident about data indexing. However, after
reading the "RDF Index Scheme" section of
http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFPerformanceTuning
I wonder whether Virtuoso 7's default indexing scheme is suitable for
queries like the above. While the predicates are defined in the query's
triple patterns, many of the patterns have no defined subject or object.
Could this be a major performance problem?
3. Perhaps there is a SPARQL construct that I am not aware of that could be
of great help in the above query. For example, I have already improved
performance by removing STR() casts and using the isLiteral() function.
Could you suggest anything else?
4. Am I perhaps overusing some complex SPARQL construct that I should
avoid?
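
For illustration, the change mentioned in question 3 can be sketched roughly
as follows. This is only a minimal sketch: it reuses the placeholder names
from the simplified query above, and the commented-out "before" form is an
assumption about what a STR()-based filter would look like, not a copy of
the real query.

```sparql
# Before (assumed form): the value is cast with STR() before matching.
#   FILTER (REGEX(STR(?variableK), "literalA", "i"))

# After: an isLiteral() guard is used and the STR() cast is dropped.
ASK
FROM NAMED <GraphCURI>
WHERE {
  GRAPH <GraphCURI> {
    ?variableI <http://www.w3.org/2004/02/skos/core#prefLabel> ?variableK .
    FILTER (isLiteral(?variableK) && REGEX(?variableK, "literalA", "i"))
  }
}
```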

Please note that I use Virtuoso Open Source Edition, built on Ubuntu,
Version: 07.20.3214, Build: Oct 14 2015.

Regards,
Pantelis Natsiavas
