Re: Fuseki query performance

Jérôme Tue, 06 Dec 2011 07:45:01 -0800

Thank you Andy,

it was the cost of serializing and deserializing.


My second problem (yes, I have another one ;-) ) is:

The goal of my queries is to find "paragraphs" that contain "words" matching a regex.

My triplestore stores approximately 1,600,000 triples.

For example, to find paragraphs (in my RDF model) containing the word "example", here is the corresponding query:


PREFIX ram:<...>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?Response
WHERE
{
?Response rdf:type <http://www.tei-c.org/ns/1.0#p> .
?Objet_1 rdf:type <http://prodescartes.greyc.fr/annotations#word> .
?Objet_1 ram:contents ?Objet_1_content .
FILTER regex(?Objet_1_content,"example") .
?Response ram:contains ?Objet_1 .
}

I get the result in 0.5 seconds.

Now, when I'm looking for paragraphs containing "example" and "help":

SELECT ?Response
WHERE
{

?Response rdf:type <http://www.tei-c.org/ns/1.0#p> .

?Objet_1 rdf:type <http://example.com#word> .
?Objet_1 ram:contents ?Objet_1_content .
FILTER regex(?Objet_1_content,"example") .
?Response ram:contains ?Objet_1 .

?Objet_2 rdf:type <http://example.com#word> .
?Objet_2 ram:contents ?Objet_2_content .
FILTER regex(?Objet_2_content,"help") .
?Response ram:contains ?Objet_2 .

}

I get the result in... 10 minutes. The result set contains around 50 results.

Why does it take so long?

The "funniest" part is what happens when I remove the type constraints on the words.
I remove these 2 lines:
?Objet_1 rdf:type <http://example.com#word> .
?Objet_2 rdf:type <http://example.com#word> .

Fuseki answers faster...

Thank you.
Jérôme
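
The two queries above follow the same pattern: one block of four lines per word, all joined on ?Response. A minimal Python sketch of that pattern, for reproducing the timings with one, two, or more words, could look like the following. The helper name build_word_query and its parameters are hypothetical, and the ram: namespace is elided in the message, so it has to be supplied by the caller.

```python
def build_word_query(words, ram_ns, paragraph_type, word_type):
    """Build a SELECT query for paragraphs containing every given word.

    ram_ns is the IRI behind the "ram:" prefix (elided in the original
    message, so it must be passed in); the two type IRIs are the classes
    used for paragraphs and words in the RDF model.
    """
    lines = [
        f"PREFIX ram: <{ram_ns}>",
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>",
        "",
        "SELECT ?Response",
        "WHERE",
        "{",
        f"  ?Response rdf:type <{paragraph_type}> .",
    ]
    for i, word in enumerate(words, start=1):
        obj = f"?Objet_{i}"
        # Escape backslashes and double quotes so the pattern stays a
        # valid SPARQL string literal.
        literal = word.replace("\\", "\\\\").replace('"', '\\"')
        lines += [
            f"  {obj} rdf:type <{word_type}> .",
            f"  {obj} ram:contents {obj}_content .",
            f'  FILTER regex({obj}_content, "{literal}") .',
            f"  ?Response ram:contains {obj} .",
        ]
    lines.append("}")
    return "\n".join(lines)
```

Calling it with ["example"] gives the shape of the first query and with ["example", "help"] the shape of the second, which makes it easy to compare timings as words are added.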


On 06/12/11 13:33, Andy Seaborne wrote:

Jérôme,

There are 150K results?

They are streamed back (unlike Joseki) but it will take a while.

Which result format are you getting? You might try one of the other result formats, which might be a bit faster.

It looks like it is simply the cost of serializing and deserializing the results. Unlike the second "count(??)" query, the first query has to access the node table to get the URI/bnode labels for every result.


    Andy
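
One way to try the suggestion about result formats is to send the same query with different Accept headers and time each response. The following is only a sketch under assumptions: the endpoint URL (localhost, port 3030, service "test", endpoint "query") is a guess based on the config file quoted in this thread, the helper names are hypothetical, and which media types the server actually honours depends on the Fuseki version.

```python
import time
import urllib.parse
import urllib.request


def build_query_request(endpoint, query, accept):
    """Build (but do not send) a SPARQL protocol GET request for one format."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": accept})


def time_query(request):
    """Send the request, read the whole body, return (seconds, byte count)."""
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        body = response.read()
    return time.perf_counter() - start, len(body)


# Example use (needs a running server; the URL below is an assumption):
#
#   endpoint = "http://localhost:3030/test/query"
#   query = "SELECT ?r { ?r a ?type }"
#   for fmt in ("application/sparql-results+xml",
#               "application/sparql-results+json",
#               "text/tab-separated-values"):
#       seconds, size = time_query(build_query_request(endpoint, query, fmt))
#       print(fmt, round(seconds, 2), size)
```

Reading the whole body inside the timed section matters here, because the results are streamed and the cost being measured is the serialization and transfer of every row.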

On 06/12/11 10:08, Jérôme wrote:

Hi,

I'm trying to query my TDB store and I have some performance problems:
Here is a simple query example:

PREFIX test:<...>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?r
{
?r rdf:type test:word .
}

I have to wait around 20 seconds to get a result. How can I optimize it?


The "count" query
PREFIX test:<...>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT count(?r)
{
?r rdf:type test:word .
}

returns 150 000.

My Fuseki server is running with the -Xmx1024m parameter.

Thank you.
Jérôme.

My config file:
<#service1> rdf:type fuseki:Service ;
fuseki:name "test" ; # http://host:port/test
fuseki:serviceQuery "query" ; # SPARQL query service
fuseki:serviceQuery "sparql" ; # SPARQL query service
fuseki:serviceUpdate "update" ; # SPARQL update service
fuseki:serviceUpload "upload" ; # Non-SPARQL upload service
fuseki:serviceReadWriteGraphStore "data" ; # SPARQL Graph store protocol (read and write)
# A separate read-only graph store endpoint:
fuseki:serviceReadGraphStore "get" ; # SPARQL Graph store protocol (read only)
fuseki:dataset <#test> ;
.


<#test> rdf:type ja:RDFDataset ;
rdfs:label "Books" ;
ja:defaultGraph
[ rdfs:label "discours.rdf" ;
a ja:MemoryModel ;
ja:content [ja:externalContent <file:Data/discours.rdf> ] ;
] ;
.
