On 12/06/12 08:29, Dave Reynolds wrote:
On 12/06/12 07:46, Jauhiainen Matti wrote:

> I'm doing the queries over single inferred in memory model, which has
> around half a million triplets. Written as RDF/XML it takes around 5
> MB on disk. I run the queries on desktop with 4 GB of RAM and Core 2
> Quad @ 2.66GHz. Am I missing something with the computational
> complexity of the first two queries? What makes the second and third
> query so different?

Without having any details of your data, it is extremely likely that all
the slowdown you are seeing is in the inference, not the query processing.

To test this try materializing an inference closure and then run your
queries on that closure [i.e. create a plain memory model and add() the
inferred model or, preferably, just those inferences you need].
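
To make the idea of materializing concrete, here is a toy sketch in Python. It is not the Jena API: the triple set, the helper names (`infer_types`, `LazyInferredView`) and the single subClassOf rule are all assumptions made up for illustration, and the "lazy" view deliberately exaggerates the cost by rerunning the rule on every lookup. The point is only that the rule work is paid once when the closure is copied into a plain set.

```python
# Toy illustration only: a hand-rolled triple set, not the Jena API.
# The single inference rule (type propagation along subClassOf) is an assumption.

def infer_types(base):
    """Return base plus every (s, 'type', D) implied by subClassOf chains."""
    closure = set(base)
    changed = True
    while changed:
        changed = False
        sub = [(c, d) for (c, p, d) in closure if p == "subClassOf"]
        for (s, p, c) in list(closure):
            if p != "type":
                continue
            for (c2, d) in sub:
                if c2 == c and (s, "type", d) not in closure:
                    closure.add((s, "type", d))
                    changed = True
    return closure


class LazyInferredView:
    """Reruns the rule on every lookup (an exaggerated stand-in for
    inference work done at query time)."""

    def __init__(self, base):
        self.base = base
        self.rule_runs = 0

    def match(self, p, o):
        self.rule_runs += 1
        return {s for (s, p2, o2) in infer_types(self.base)
                if p2 == p and o2 == o}


base = {
    ("a", "type", "X"),
    ("X", "subClassOf", "Y"),
    ("Y", "subClassOf", "Z"),
}

lazy = LazyInferredView(base)
for _ in range(3):
    lazy.match("type", "Z")          # the rule runs again for each query

materialized = infer_types(base)     # run the rule once, keep a plain set
hits = {s for (s, p, o) in materialized if p == "type" and o == "Z"}
```

In Jena terms the `materialized` set corresponds to the plain memory model that has had the inferred model added to it; queries against it involve no reasoner at all.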

Dave

On 12/06/12 07:46, Jauhiainen Matti wrote:
> Hi,
>
> I have performance issues with certain types of SPARQL queries over Jena model. Things get slow when I try to query patterns with multiple relations between resources, for example:
>
> DESCRIBE ?var1 ?var2 ?var3 WHERE {
>          ?var1 NS:type 'X' .
>          ?var2 NS:type 'Y' .
>          ?var3 NS:type 'Z' .

A three-way partial cross product: the three type patterns share no variables with each other.

>          ?var1 NS:dependency ?var2 .
>          ?var2 NS:dependency ?var3

Those last two can be very expensive as well.

You may find reordering the pattern helps.

Try a SELECT query and see what happens - DESCRIBE is doing an implicit SELECT DISTINCT - try without DISTINCT and see how many cases the query has to consider.

> }
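
As a rough illustration of why the order of the patterns can matter, here is a toy nested-loop join in Python. It is not how ARQ evaluates queries; the `evaluate` helper, the data (10 resources of each type linked in x -> y -> z chains) and the plain-string terms are all assumptions for the sketch. It counts how many intermediate bindings a naive left-to-right evaluation produces for the first query in two different orders.

```python
def is_var(term):
    return term.startswith("?")


def evaluate(triples, patterns):
    """Naive left-to-right nested-loop join.

    Returns (solutions, steps), where steps is the total number of
    intermediate bindings that exist after each pattern is applied.
    """
    solutions = [{}]
    steps = 0
    for pattern in patterns:
        extended = []
        for binding in solutions:
            for triple in triples:
                new = dict(binding)
                ok = True
                for term, value in zip(pattern, triple):
                    if is_var(term):
                        # Bind the variable, or check an existing binding.
                        if new.setdefault(term, value) != value:
                            ok = False
                            break
                    elif term != value:
                        ok = False
                        break
                if ok:
                    extended.append(new)
        solutions = extended
        steps += len(solutions)
    return solutions, steps


# Hypothetical data: x0 -> y0 -> z0, x1 -> y1 -> z1, ...
n = 10
triples = []
for i in range(n):
    triples += [
        (f"x{i}", "type", "X"),
        (f"y{i}", "type", "Y"),
        (f"z{i}", "type", "Z"),
        (f"x{i}", "dependency", f"y{i}"),
        (f"y{i}", "dependency", f"z{i}"),
    ]

# Order as written in the query: all three type patterns first.
types_first = [
    ("?var1", "type", "X"),
    ("?var2", "type", "Y"),
    ("?var3", "type", "Z"),
    ("?var1", "dependency", "?var2"),
    ("?var2", "dependency", "?var3"),
]
# Reordered so each pattern shares a variable with the ones before it.
connected_first = [
    ("?var1", "type", "X"),
    ("?var1", "dependency", "?var2"),
    ("?var2", "type", "Y"),
    ("?var2", "dependency", "?var3"),
    ("?var3", "type", "Z"),
]

slow_solutions, slow_steps = evaluate(triples, types_first)
fast_solutions, fast_steps = evaluate(triples, connected_first)
```

Both orders return the same 10 solutions, but on this toy data the types-first order builds 1220 intermediate bindings against 50 for the connected order. Real data and a real query engine will behave differently, so treat the numbers as an illustration of the shape of the problem only.
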
>
> or even just:
>
> DESCRIBE ?var1 ?var2 ?var3 WHERE {
>          ?var1 NS:type 'X' .
>          ?var2 NS:type 'Y' .
>          ?var3 NS:type 'Z' .

Suggests it's the unconnected types causing lots of work, possibly stressing the JVM.

> }
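
For this second query no reordering can help, because the patterns share no variables: the number of solutions is simply the product of the three match counts. A small Python sketch of the arithmetic, with made-up counts (the real counts for 'X', 'Y' and 'Z' in your data are unknown to me):

```python
from itertools import product
from math import prod


def cross_product_size(*counts):
    """Number of solutions when the patterns share no variables."""
    return prod(counts)


# Hypothetical resources for each NS:type value.
xs = [f"x{i}" for i in range(4)]
ys = [f"y{i}" for i in range(3)]
zs = [f"z{i}" for i in range(5)]

# Enumerate every (?var1, ?var2, ?var3) combination explicitly.
rows = list(product(xs, ys, zs))

small = cross_product_size(len(xs), len(ys), len(zs))   # 4 * 3 * 5
large = cross_product_size(1000, 1000, 1000)            # a billion rows
```

With only a thousand resources of each type the query already has a billion combinations to enumerate before DISTINCT is applied.
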
>
> These take longer to complete than I care to wait (over an hour at least) while similar query will complete in seconds, e.g.
>
> DESCRIBE ?var1 ?var2 WHERE {
>          ?var1 NS:type 'X' .
>          ?var2 NS:type 'Y' .
>          ?var1 NS:dependency ?var2 .

How many matches are there for ?var NS:type 'Z'?

Do you need to ask the NS:type at all?

> }
>
> I'm doing the queries over single inferred in memory model, which has around half a million triplets. Written as RDF/XML it takes around 5 MB on disk. I run the queries on desktop with 4 GB of RAM and Core 2 Quad @ 2.66GHz. Am I missing something with the computational complexity of the first two queries? What makes the second and third query so different?

Inference does not look to be the root cause but it's going to add a cost.

>
> Regards,
>
> Matti Jauhiainen
>
>
