Re: Jena query performance

Andy Seaborne Tue, 12 Jun 2012 02:56:47 -0700

On 12/06/12 08:29, Dave Reynolds wrote:

On 12/06/12 07:46, Jauhiainen Matti wrote:

I'm doing the queries over single inferred in memory model, which has
around half a million triplets. Written as RDF/XML it takes around 5
MB on disk. I run the queries on desktop with 4 GB of RAM and Core 2
Quad @ 2.66GHz. Am I missing something with the computational
complexity of the first two queries? What makes the second and third
query so different?


Without having any details of your data, it is extremely likely that all
the slowdown you are seeing is in the inference, not the query processing.

To test this try materializing an inference closure and then run your
queries on that closure [i.e. create a plain memory model and add() the
inferred model or, preferably, just those inferences you need].

Dave


On 12/06/12 07:46, Jauhiainen Matti wrote:
> Hi,
>

> I have performance issues with certain types of SPARQL queries over Jena model. Things get slow when I try to query patterns with multiple relations between resources, for example:

>
> DESCRIBE ?var1 ?var2 ?var3 WHERE {
>          ?var1 NS:type 'X' .
>          ?var2 NS:type 'Y' .
>          ?var3 NS:type 'Z' .

A three-way partial cross product.

>          ?var1 NS:dependency ?var2 .
>          ?var2 NS:dependency ?var3

Those last two can be very expensive as well.

You may find reordering the pattern helps.

Try a SELECT query and see what happens. DESCRIBE is doing an implicit SELECT DISTINCT, so try without DISTINCT and see how many cases the query has to consider.


> }
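
As a rough illustration of why reordering the pattern can matter, here is a minimal Python sketch. It is a toy model and not how Jena's query engine works internally; the data (30 resources per type, chained by dependency) and the function names `types_first` and `edges_first` are made up for illustration. It evaluates the same five-pattern query in two orders and counts how many candidate combinations each order has to examine.

```python
from itertools import product

# Hypothetical toy data: 30 resources of each type.
n = 30
types = {}
for i in range(n):
    types[f"x{i}"] = "X"
    types[f"y{i}"] = "Y"
    types[f"z{i}"] = "Z"

# x_i depends on y_i, and y_i depends on z_i.
dependency = {(f"x{i}", f"y{i}") for i in range(n)} | {
    (f"y{i}", f"z{i}") for i in range(n)
}


def by_type(t):
    """All resources whose type is t."""
    return [r for r, rt in types.items() if rt == t]


def types_first():
    """Enumerate every (X, Y, Z) combination, then test the two edges."""
    examined, results = 0, set()
    for v1, v2, v3 in product(by_type("X"), by_type("Y"), by_type("Z")):
        examined += 1
        if (v1, v2) in dependency and (v2, v3) in dependency:
            results.add((v1, v2, v3))
    return examined, results


def edges_first():
    """Follow the dependency edges first, then test the types."""
    outgoing = {}
    for s, o in dependency:
        outgoing.setdefault(s, []).append(o)
    examined, results = 0, set()
    for v1, v2 in dependency:
        for v3 in outgoing.get(v2, []):
            examined += 1
            if types[v1] == "X" and types[v2] == "Y" and types[v3] == "Z":
                results.add((v1, v2, v3))
    return examined, results


naive_examined, naive_results = types_first()
joined_examined, joined_results = edges_first()
print(naive_examined, joined_examined, len(naive_results))  # 27000 30 30
```

Both orders return the same 30 solutions on this toy data, but the types-first order looks at 30 * 30 * 30 candidates while the edges-first order looks at 30. Real data will behave differently, so treat this only as a picture of the effect.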
>
> or even just:
>
> DESCRIBE ?var1 ?var2 ?var3 WHERE {
>          ?var1 NS:type 'X' .
>          ?var2 NS:type 'Y' .
>          ?var3 NS:type 'Z' .

Suggests it's the unconnected types causing lots of work, possibly stressing the JVM.


> }
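
To put rough numbers on the unconnected case, a small Python sketch of the arithmetic follows. The counts are hypothetical, since the thread does not say how many resources have each type, and `cross_product_rows` is just an illustrative helper. When triple patterns share no variables, every match of one combines with every match of the others, so the number of solutions is the product of the per-pattern match counts.

```python
from math import prod


def cross_product_rows(counts):
    """Solutions produced by triple patterns that share no variables."""
    return prod(counts)


# Hypothetical numbers of resources typed 'X', 'Y' and 'Z'.
print(cross_product_rows([100, 100, 100]))     # 1000000
print(cross_product_rows([1000, 1000, 1000]))  # 1000000000
```

With half a million triples, even modest per-type counts multiply into a very large number of solutions, which is consistent with the query not finishing in an hour.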
>

> These take longer to complete than I care to wait (over an hour at least) while a similar query will complete in seconds, e.g.

>
> DESCRIBE ?var1 ?var2 WHERE {
>          ?var1 NS:type 'X' .
>          ?var2 NS:type 'Y' .
>          ?var1 NS:dependency ?var2 .

How many matches are there for ?var NS:type 'Z'?

Do you need to ask the NS:type at all?

> }
>

> I'm doing the queries over single inferred in memory model, which has around half a million triplets. Written as RDF/XML it takes around 5 MB on disk. I run the queries on desktop with 4 GB of RAM and Core 2 Quad @ 2.66GHz. Am I missing something with the computational complexity of the first two queries? What makes the second and third query so different?


Inference does not look to be the root cause but it's going to add a cost.

>
> Regards,
>
> Matti Jauhiainen
>
>
