Re: SPARQL performance question

Steve Vestal Sun, 23 Feb 2020 12:59:57 -0800

Attached is a graph of heap usage.  It peaked at about 1.5G, well below
the 4G limit.  VisualVM wouldn't record running CPU usage, but the OS
monitor showed a fairly steady 35% throughout.  Heap usage dropped
noticeably about 5 minutes into the run, but the time between bursts of
8 rows stayed roughly constant at ~1 min.
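
The gap between bursts could also be measured per call rather than read
off a monitor.  A minimal sketch, assuming the ResultSet is used only
through java.util.Iterator (Jena's ResultSet extends
Iterator<QuerySolution>).  TimedIterator is a hypothetical helper written
for illustration, not a Jena class:

```java
import java.util.Iterator;
import java.util.List;

/** Wraps any Iterator and accumulates wall-clock time spent inside hasNext(). */
final class TimedIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private long nanosInHasNext = 0;
    private long rowsReturned = 0;

    TimedIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        long start = System.nanoTime();
        try {
            return delegate.hasNext();
        } finally {
            // Count the time even if the delegate throws.
            nanosInHasNext += System.nanoTime() - start;
        }
    }

    @Override
    public T next() {
        T value = delegate.next();
        rowsReturned++;
        return value;
    }

    /** Total nanoseconds spent in hasNext() so far. */
    long nanosInHasNext() {
        return nanosInHasNext;
    }

    /** Number of elements handed out by next() so far. */
    long rowsReturned() {
        return rowsReturned;
    }

    public static void main(String[] args) {
        // Stand-in data; with Jena one would pass the ResultSet instead.
        TimedIterator<String> rows = new TimedIterator<>(List.of("r1", "r2", "r3").iterator());
        while (rows.hasNext()) {
            rows.next();
            System.out.println("rows=" + rows.rowsReturned()
                    + " msInHasNext=" + rows.nanosInHasNext() / 1_000_000);
        }
    }
}
```

Wrapping the query result this way and logging the counters after each
row would show whether the time really accrues in hasNext() and whether
it arrives once per burst or once per row.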


The ~800 statements I mentioned were the model size() of the imports.
Total model size() is ~1500, and the number of statements iterated from
the OntModel is ~4700 (iterated just prior to running the select query,
and perhaps entailed into the OntModel before the query).

All models are in memory.

Here is the query:

    PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX owl:<http://www.w3.org/2002/07/owl#>
    SELECT ?connectionAA ?connectionAB ?connectionBA ?connectionBB
    ?leftA ?leftB ?rightA ?rightB ?singleHardware
    WHERE {
    ?connectionAA rdf:type <http://www.somedomain.net/aadl#portConnection>.
    ?connectionAB rdf:type <http://www.somedomain.net/aadl#portConnection>.
    ?connectionBA rdf:type <http://www.somedomain.net/aadl#portConnection>.
    ?connectionBB rdf:type <http://www.somedomain.net/aadl#portConnection>.
    ?leftA rdf:type <http://www.somedomain.net/aadl#thread>.
    ?leftB rdf:type <http://www.somedomain.net/aadl#thread>.
    ?rightA rdf:type <http://www.somedomain.net/aadl#thread>.
    ?rightB rdf:type <http://www.somedomain.net/aadl#thread>.
    ?singleHardware rdf:type <http://www.somedomain.net/aadl#platform>.
    ?leftA <http://www.somedomain.net/aadl#simplexConnectTo> ?connectionAA.
    ?connectionAA <http://www.somedomain.net/aadl#simplexConnectTo> ?rightA.
    ?leftA <http://www.somedomain.net/aadl#simplexConnectTo> ?connectionAB.
    ?connectionAB <http://www.somedomain.net/aadl#simplexConnectTo> ?rightB.
    ?leftB <http://www.somedomain.net/aadl#simplexConnectTo> ?connectionBA.
    ?connectionBA <http://www.somedomain.net/aadl#simplexConnectTo> ?rightA.
    ?leftB <http://www.somedomain.net/aadl#simplexConnectTo> ?connectionBB.
    ?connectionBB <http://www.somedomain.net/aadl#simplexConnectTo> ?rightB.
    ?connectionAA
        <http://www.somedomain.net/ontology/fhowl/singlepointfailpattern#boundTo>
        ?singleHardware.
    ?connectionBA
        <http://www.somedomain.net/ontology/fhowl/singlepointfailpattern#boundTo>
        ?singleHardware.
    FILTER (?connectionAA!=?connectionAB && ?connectionAA!=?connectionBA
            && ?connectionAA!=?connectionBB && ?connectionAA!=?leftA
            && ?connectionAA!=?leftB && ?connectionAA!=?rightA
            && ?connectionAA!=?rightB && ?connectionAA!=?singleHardware
            && ?connectionAB!=?connectionBA && ?connectionAB!=?connectionBB
            && ?connectionAB!=?leftA && ?connectionAB!=?leftB
            && ?connectionAB!=?rightA && ?connectionAB!=?rightB
            && ?connectionAB!=?singleHardware
            && ?connectionBA!=?connectionBB && ?connectionBA!=?leftA
            && ?connectionBA!=?leftB && ?connectionBA!=?rightA
            && ?connectionBA!=?rightB && ?connectionBA!=?singleHardware
            && ?connectionBB!=?leftA && ?connectionBB!=?leftB
            && ?connectionBB!=?rightA && ?connectionBB!=?rightB
            && ?connectionBB!=?singleHardware
            && ?leftA!=?leftB && ?leftA!=?rightA && ?leftA!=?rightB
            && ?leftA!=?singleHardware
            && ?leftB!=?rightA && ?leftB!=?rightB && ?leftB!=?singleHardware
            && ?rightA!=?rightB && ?rightA!=?singleHardware
            && ?rightB!=?singleHardware)
    }
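
For what it's worth, the pairwise FILTER above doesn't have to be typed
by hand.  A minimal sketch in plain Java (no Jena dependency) that
generates the same 36 not-equals terms from the nine variable names.
The class, method, and constant names are made up for illustration and
are not part of any library:

```java
import java.util.List;
import java.util.StringJoiner;

/** Builds a SPARQL FILTER requiring every pair of the given variables to differ. */
final class PairwiseDistinctFilter {

    /** The nine variables of the query, in the order the FILTER lists them. */
    static final List<String> QUERY_VARS = List.of(
            "connectionAA", "connectionAB", "connectionBA", "connectionBB",
            "leftA", "leftB", "rightA", "rightB", "singleHardware");

    /** Returns one "?x != ?y" term per unordered pair, joined with "&&". */
    static String build(List<String> variableNames) {
        if (variableNames.size() < 2) {
            return ""; // fewer than two variables: nothing to compare
        }
        StringJoiner terms = new StringJoiner(" && ", "FILTER (", ")");
        for (int i = 0; i < variableNames.size(); i++) {
            for (int j = i + 1; j < variableNames.size(); j++) {
                terms.add("?" + variableNames.get(i) + " != ?" + variableNames.get(j));
            }
        }
        return terms.toString();
    }

    public static void main(String[] args) {
        // Prints a single-line FILTER with 9 * 8 / 2 = 36 terms.
        System.out.println(build(QUERY_VARS));
    }
}
```

Generating the clause keeps the constraint list complete if a variable is
added later.  It does not change how the query is evaluated.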


On 2/23/2020 1:54 PM, Marco Neumann wrote:
> a copy of the actual query + assertions would be advisable here in addition
> to runtime information (cpu,mem,os,jdk)
>
>
> On Sun 23. Feb 2020 at 18:57, Steve Vestal <[email protected]>
> wrote:
>
>> I'm looking for suggestions on a SPARQL performance issue.  My test
>> model has ~800 sentences, and processing of one select query takes about
>> 25 minutes.  The query is a basic graph pattern with 9 variables and 20
>> triples, plus a filter that forces distinct variables to have distinct
>> solutions using pair-wise not-equals constraints.  No OPTIONAL clause or
>> anything else fancy.
>>
>> I am issuing the query against an inference model.  Most of the asserted
>> sentences are in imported models.  If I iterate over all the statements
>> in the OntModel, I get ~1500 almost instantly.  I experimented with
>> several of the reasoners.
>>
>> Below is the basic control flow.  The thing I found curious is that the
>> execSelect() method finishes almost instantly.  It is the iteration over
>> the ResultSet that is taking all the time, apparently in the call to
>> selectResult.hasNext(). The result has 192 rows, 9 columns.  The results
>> are provided in bursts of 8 rows each, with ~1 minute between bursts.
>>
>>         OntModel ontologyModel = getMyOntModel(); // Tried various reasoners
>>         String selectQuery = getMySelectQuery();
>>         QueryExecution selectExec =
>>             QueryExecutionFactory.create(selectQuery, ontologyModel);
>>         ResultSet selectResult = selectExec.execSelect();
>>         // Time seems to be spent in hasNext()
>>         while (selectResult.hasNext()) {
>>             QuerySolution selectSolution = selectResult.next();
>>             for (String var : getMyVariablesOfInterest()) {
>>                 RDFNode varValue = selectSolution.get(var);
>>                 // process varValue
>>             }
>>         }
>>
>> Any suggestions would be appreciated.
>>
>> --
>
> ---
> Marco Neumann
> KONA
>
