Re: Slow SPARQL query

Andy Seaborne Tue, 16 Aug 2016 03:13:50 -0700

On 15/08/16 09:47, Mikael Pesonen wrote:


Hi,

what do you mean by masking? It should remove duplicates and it makes
the query run in half the time compared to without DISTINCT. The result
count at least is the same.

Mikael

If DISTINCT causes a lot of results to be turned into a few, it is hiding a lot of work by the query engine.

If it's the inner DISTINCT that halves the execution time, then the improvements (in dev builds) to property* may help you.

If it's the outer one, it's a serialization issue (which I doubt at this scale).


        Andy
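
One way to check whether the inner DISTINCT is hiding a lot of work is to count the rows the pattern produces before and after de-duplication. The sketch below is an illustration under assumptions, not a query from the thread: it adds the standard SKOS prefix declaration (the quoted queries omit it), lists only two of the 13 start concepts, and the names ?rows and ?distinctChildren are made up for the example.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# ?rows counts every solution of the UNION (one per graph/child match);
# ?distinctChildren counts what survives DISTINCT on ?child.
SELECT (COUNT(*) AS ?rows) (COUNT(DISTINCT ?child) AS ?distinctChildren)
WHERE {
    GRAPH ?graph {
        {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/c16e9937a515bda6> skos:narrower* ?child}
        UNION
        {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/e56f6309f0d86b95> skos:narrower* ?child}
        # remaining start concepts omitted for brevity
    }
}
```

If ?rows comes out far larger than ?distinctChildren, the engine is producing many duplicate solutions that DISTINCT then discards, which is the masking effect in question.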



On 12.8.2016 13:53, Andy Seaborne wrote:

On 08/08/16 11:56, Mikael Pesonen wrote:


Hi Andy,

storage is started like this:

/usr/bin/java -Xmx3600M -jar
/home/text/tools/apache-jena-fuseki-2.3.1/fuseki-server.jar --update
--port 3030 --loc=../apache-jena-3.0.1/DB /ds

Ontology data is simple SKOS, and document data is also simple DC
metadata triples. Query returns ~15k triples.

I tested the SKOS part, and this executed in less than one second,
returning ~50 items:


How many without the two DISTINCT?

I am wondering if the DISTINCT (the inner one) is masking a lot of
results.


SELECT DISTINCT *
WHERE {
    GRAPH ?graph {
        SELECT DISTINCT ?child WHERE {
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/c16e9937a515bda6> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/e56f6309f0d86b95> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/b393055ac0f3a0bc> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/642194686a67f935> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/a9beeb4bf0b0af70> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/ce3598292f301cec> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/26aa300e4c033981> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/bd07d765f36ea88f> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/bcf9e082e2ae8c9b> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/78d3955357a8ac10> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/369b1a9c822f55db> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/7098a84669b9feca> skos:narrower* ?child}
            UNION
            {<http://www.lingsoft.fi/ontologies/VerohallintoAsiakaskirjeet/b7cb30c4efed996a> skos:narrower* ?child}
        }
    }
}

Br,
Mikael


On 8.8.2016 13:43, Andy Seaborne wrote:

There is a certain amount of "it depends" here: what is the data
stored in? What shape is the data? Which Jena version?

In the next release, and available in development builds is:

https://issues.apache.org/jira/browse/JENA-1195

where property* was sped up recently.  Usually, it took moderately
unusual data to show this up, but the repeated use of an expensive
operation in property* may be happening here too.

Mikael - are you able to try out a SNAPSHOT build?

    Andy


On 08/08/16 11:37, Håvard Ottestad wrote:

Is this any better?

SELECT DISTINCT ?s ?p ?o WHERE {

  GRAPH <http://www.lingsoft.fi/resource-meta/> {
   ?s <http://purl.org/dc/terms/isPartOf>
<http://www.lingsoft.fi/rdf/uid/574ef1a40236a> .
    ?s <http://purl.org/dc/terms/subject> ?child .
   ?s ?p ?o
 }

  GRAPH <http://www.lingsoft.fi/> {
    SELECT DISTINCT ?child WHERE {
      {<http://www.lingsoft.fi/c16e9937a515bda6> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/e56f6309f0d86b95> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/b393055ac0f3a0bc> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/642194686a67f935> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/a9beeb4bf0b0af70> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/ce3598292f301cec> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/26aa300e4c033981> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/bd07d765f36ea88f> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/bcf9e082e2ae8c9b> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/78d3955357a8ac10> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/369b1a9c822f55db> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/7098a84669b9feca> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/b7cb30c4efed996a> skos:narrower* ?child}
     }
  }

}

Regards,
Håvard M. Ottestad
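
Since the 13 UNION branches differ only in their start IRI, another formulation worth trying is a single property path driven by a VALUES block. This is a sketch under assumptions rather than a tested fix from the thread: the PREFIX declarations and the variable ?root are added for the example, only the first three start concepts are listed, and whether it runs faster depends on the Jena version and the data.

```sparql
PREFIX skos:    <http://www.w3.org/2004/02/skos/core#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?s ?p ?o
WHERE {
  GRAPH <http://www.lingsoft.fi/> {
    SELECT DISTINCT ?child WHERE {
      # ?root is a hypothetical helper variable; add the remaining
      # start concepts from the original query to this list.
      VALUES ?root {
        <http://www.lingsoft.fi/c16e9937a515bda6>
        <http://www.lingsoft.fi/e56f6309f0d86b95>
        <http://www.lingsoft.fi/b393055ac0f3a0bc>
      }
      ?root skos:narrower* ?child
    }
  }

  GRAPH <http://www.lingsoft.fi/resource-meta/> {
    ?s dcterms:isPartOf <http://www.lingsoft.fi/rdf/uid/574ef1a40236a> .
    ?s dcterms:subject ?child .
    ?s ?p ?o
  }
}
```

With the full list of start concepts, the inner SELECT DISTINCT should return the same set of ?child values as the UNION form, so the two versions can be compared directly on result count and timing.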

On 08 Aug 2016, at 11:25, Mikael Pesonen
<[email protected]> wrote:


Hi,

I'm not sure if this is the correct forum to ask, but I hope you can help.
This query takes over 20 seconds with Jena:

SELECT DISTINCT ?s ?p ?o WHERE {
  GRAPH <http://www.lingsoft.fi/> {
    SELECT DISTINCT ?child WHERE {
      {<http://www.lingsoft.fi/c16e9937a515bda6> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/e56f6309f0d86b95> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/b393055ac0f3a0bc> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/642194686a67f935> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/a9beeb4bf0b0af70> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/ce3598292f301cec> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/26aa300e4c033981> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/bd07d765f36ea88f> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/bcf9e082e2ae8c9b> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/78d3955357a8ac10> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/369b1a9c822f55db> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/7098a84669b9feca> skos:narrower* ?child}
      UNION {<http://www.lingsoft.fi/b7cb30c4efed996a> skos:narrower* ?child}
    }
  }
  GRAPH <http://www.lingsoft.fi/resource-meta/> {
    ?s <http://purl.org/dc/terms/subject> ?child .
    ?s <http://purl.org/dc/terms/isPartOf> <http://www.lingsoft.fi/rdf/uid/574ef1a40236a> .
    ?s ?p ?o
  }
}

The first graph query is for getting keywords from an ontology graph; the
second is for querying documents having those keywords. Is there a better
way/order to make this query?

Thank you for the help,
Mikael

--
www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's
and Writer's Tools - Text Tools - E-books and M-books

Mikael Pesonen
System Engineer

e-mail: [email protected]
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Linnankatu 10 A
FI-20100 Turku
FINLAND
