On 19/09/12 18:48, Bill Roberts wrote:
Thanks Andy. Further info inline.
It depends on load, but I doubt that at 20e6 it's going to take 10+ mins. It's
not a cheap query, but it's not completely bad. If GC went mad with
close-to-full thrashing, then maybe, but if so the query will probably soon run
out of heap.
Has this server crashed and rebooted?
Not sure - I think we may have had to force a reboot at one point, without
stopping Fuseki first. As this instance doesn't get updated very often, though,
it's unlikely that this would have happened during a write.
Has data been incrementally added?
Yes
Any exceptions in the log when doing an update?
Not that I'm aware of.
It is possible that one of the old bugs has caused a loop in the index. The
query will never finish, but the amount of memory will not go up - it's in a
CPU loop.
Try this:
SELECT (count(*) AS ?count) WHERE { ?s ?p ?o }
and see if it finishes. Not a perfect test but indicative.
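If it helps, here is a minimal sketch of running that test from a script with a client-side time limit, so the client gives up by itself instead of hanging. It assumes the standard SPARQL protocol over HTTP POST; the endpoint URL in the usage note and the names `build_count_request` and `count_triples` are made up for illustration and are not part of Fuseki.

```python
import json
import socket
import urllib.error
import urllib.parse
import urllib.request

COUNT_QUERY = "SELECT (count(*) AS ?count) WHERE { ?s ?p ?o }"


def build_count_request(endpoint):
    """Build a SPARQL-protocol POST request carrying the count query."""
    body = urllib.parse.urlencode({"query": COUNT_QUERY}).encode("ascii")
    return urllib.request.Request(
        endpoint,
        data=body,  # supplying data makes urllib send a form-encoded POST
        headers={"Accept": "application/sparql-results+json"},
    )


def count_triples(endpoint, timeout_s=60.0):
    """Return the triple count, or None if no answer arrives.

    None covers both a timeout and a failed connection.
    """
    request = build_count_request(endpoint)
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            results = json.load(response)
    except (socket.timeout, urllib.error.URLError):
        return None
    return int(results["results"]["bindings"][0]["count"]["value"])
```

Usage would be something like `count_triples("http://localhost:3030/ds/query", timeout_s=360)`, where the URL is an assumed dataset location. Note that the time limit only releases the client: if the server really is in a CPU loop, it keeps spinning after the client disconnects.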
I tried this. It hadn't finished after 6 minutes, so I killed it (restarted
Fuseki).
Memory use was high but not rising while the query was running.
CPU was maxed out (on one core).
Incidentally, Fuseki in this state was quite hard to kill - it didn't
respond to a stop via /etc/init.d - I had to kill -9 the java process.
If the code is in a CPU loop, might that explain why the timeout doesn't work -
because the code never gets a chance to check whether it should give up on this
query?
Yes - that's what is probably happening.
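To illustrate the general mechanism, here is a generic sketch of a cooperative timeout. It is not the actual Jena/ARQ code, and every name in it is hypothetical. The deadline is only checked when the row producer hands back control, so a producer that spins internally delays the check for as long as it spins.

```python
import time


class QueryCancelled(Exception):
    """Raised when the deadline check notices the query has run too long."""


def run_with_cooperative_timeout(rows, timeout_s):
    """Collect rows, checking the deadline each time a row is produced."""
    deadline = time.monotonic() + timeout_s
    collected = []
    for row in rows:
        # The only place the timeout is noticed: between rows.
        if time.monotonic() > deadline:
            raise QueryCancelled()
        collected.append(row)
    return collected


def slow_rows(n, delay_s):
    """A slow but well-behaved producer: it yields control after every row."""
    for i in range(n):
        time.sleep(delay_s)
        yield i


def stuck_rows(spin_s):
    """Stand-in for a looping index scan: burns CPU before yielding anything.

    A real loop would never end; this one stops after spin_s seconds so the
    example terminates.
    """
    end = time.monotonic() + spin_s
    while time.monotonic() < end:
        pass
    yield 0
```

With `slow_rows(100, 0.01)` and a 0.05 s timeout, the query is cancelled shortly after the deadline. With `stuck_rows(0.3)` and the same timeout, cancellation only happens once the spin ends, well past the deadline. If the spin never ends, the check is never reached, which matches a timeout that appears not to work.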
(In any case I will upgrade that server to Fuseki 0.2.4 - this is a
good reminder that I should do that).
Yes!
I have upgraded to 0.2.4.
I repeated the count-everything test, with the same outcome.
If a looped index is the problem, presumably the only solution is to
reload all the data into a new TDB instance?
Yes - I'm afraid so.
0.2.5-SNAPSHOT is even more robust.
Andy
Thanks
Bill