On 18/09/12 15:20, Bill Roberts wrote:
Someone ran this query against one of our servers recently:
    SELECT DISTINCT ?class ?property
    WHERE {
      { ?s ?property ?o } .
      OPTIONAL { ?s rdf:type ?class }
    }
    LIMIT 50 OFFSET 0
(then ran it again for good measure when the first one didn't
finish).
The server is running Fuseki (0.2.2) with a timeout set: ja:cxtValue
"11000,20000". My question is: does anyone know why the Fuseki
timeout didn't kill this query? It was still running 10 mins later
and causing slow responses on the server.
The data it was running against would tend to make this query take a
long time: the first data in the indexes is 20 million or so triples,
all of the same class, so it would find its first result quickly but
would have a lot of work to do to get to 50 different classes.
It depends on load, but I doubt that at 20e6 triples it's going to take
10+ mins. It's not a cheap query, but it's not completely bad. If GC
went mad with close-to-full thrashing, then maybe, but if so the query
will probably soon run out of heap.
Has this server crashed and rebooted?
Has data been incrementally added?
Any exceptions in the log when doing an update?
It is possible that one of the old bugs has caused a loop in the index.
If so, the query will never finish, but the amount of memory will not
go up - it's in a CPU loop.
Try this:
SELECT (count(*) AS ?count) WHERE { ?s ?p ?o }
and see if it finishes. Not a perfect test but indicative.
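
If it is more convenient to run that test from a script than from the query form, something like the following might do. It is only a sketch: it assumes the server accepts a standard SPARQL protocol GET and can return JSON results, the function names are made up for illustration, and the endpoint URL in the comment is hypothetical. The timeout here is purely client-side, so it only stops the client from waiting; it should not be assumed to cancel the query on the server.

```python
import json
import urllib.parse
import urllib.request

COUNT_QUERY = "SELECT (count(*) AS ?count) WHERE { ?s ?p ?o }"


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_count(body):
    """Extract the ?count value from a SPARQL JSON results document."""
    results = json.loads(body)
    return int(results["results"]["bindings"][0]["count"]["value"])


def count_triples(endpoint, timeout_seconds=60):
    """Run the count query against the endpoint.

    urlopen raises an exception if connecting, or waiting for data,
    blocks for longer than timeout_seconds.
    """
    request = build_request(endpoint, COUNT_QUERY)
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return parse_count(response.read().decode("utf-8"))


# Example call (hypothetical endpoint, substitute the real query URL):
#   count_triples("http://localhost:3030/ds/query", timeout_seconds=120)
```

If the call returns a count, a full scan of the index completes; if it times out while the server sits at full CPU with flat memory use, that points towards the loop described above.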
(In any case I will upgrade that server to Fuseki 0.2.4 - this is a
good reminder that I should do that).
Yes!
Thanks
Bill