On 22/12/14 02:03, Petr Baudis wrote:
Hi!
On Sun, Dec 21, 2014 at 08:30:02PM +0000, Andy Seaborne wrote:
On 21/12/14 16:54, Petr Baudis wrote:
It works beautifully, but the system puts Fuseki under a pretty heavy
load, with several tens of SPARQL queries per second at times, often in
parallel. And after about an hour on average, Fuseki just hangs up,
still accepting new queries but never generating a result.
..snip..
What is the setup in terms of hardware (RAM size, number of CPUs etc
etc), operating system and versions? The details do matter here.
This is:
24GiB RAM
8x AMD FX(tm)-8350 Eight-Core Processor
Linux 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt2-1 (2014-12-08) x86_64
GNU/Linux
(Debian Wheezy, with some leaf Jessie packages mixed in)
Fuseki 1.1.1 (binary distribution)
Jena 2.12.1 (used for tdbloader)
The SPARQL endpoint is publicly available at
http://pasky.or.cz:3030/dbpedia/query
(right now I just run fuseki in a loop and kill it every 10 minutes
as a stopgap measure).
This may be JENA-801 [1].
Hmm. I see, that's interesting. Just to clarify, though - what I'm
seeing is a hard hang, the Fuseki process is not consuming any CPU and
no queries are ever answered (at least in the order of hours). It is
not simply a performance degradation, which I get the impression is
what JENA-801 is about.
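A minimal sketch of how a client could tell this kind of hard hang apart
from mere slowness: issue a request with a timeout and see whether any
response arrives at all. The names HangProbe, probe and probeSilentServer
are made up for illustration, and nothing here is Fuseki-specific; it uses
only the JDK's HttpURLConnection. The demo stands in for a hung server
with a local socket that accepts connections but never answers.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;

class HangProbe {
    enum Result { ANSWERED, TIMED_OUT, FAILED }

    /** Sends a GET and reports whether any HTTP response arrived in time. */
    static Result probe(String url, int timeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            conn.getResponseCode();      // blocks until a status line or the timeout
            return Result.ANSWERED;      // any status counts: the server is responding
        } catch (SocketTimeoutException e) {
            return Result.TIMED_OUT;     // connected (or tried to) but nothing came back
        } catch (IOException e) {
            return Result.FAILED;        // e.g. connection refused
        } finally {
            if (conn != null) conn.disconnect();
        }
    }

    /** Demo: probe a local listening socket that never calls accept() or replies. */
    static Result probeSilentServer(int timeoutMillis) {
        try (ServerSocket silent = new ServerSocket(0)) {
            return probe("http://127.0.0.1:" + silent.getLocalPort() + "/", timeoutMillis);
        } catch (IOException e) {
            return Result.FAILED;
        }
    }

    public static void main(String[] args) {
        System.out.println(probeSilentServer(500));
    }
}
```

Pointed at a real endpoint with a cheap query and a generous timeout,
TIMED_OUT would suggest a hang rather than a slow answer, though a single
timeout is only a hint.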
(Also, while I'm hitting Fuseki with a lot of queries, I believe there
should never be more than two connections + queries going on at the same
time. The queries are pretty simple, typically take 2-3ms to service.)
(Also, I really do just queries, no updates, I'm running in read-only
mode.)
(Also, no messages like Java GC notifications are printed on Fuseki's
console in the event of this deadlock.)
So this really seems quite different to what I'm reading there and in
JENA-689, JENA-703.
That's why I'd like to see what has been done, to know if the update
changes were also tried out. If your usage is just query load, I can
build a special for you to try out (or you can do it yourself: replace
the body of CacheLRU with the body of CacheGuava, add the dependency
to ARQ and build with Maven).
If in the light of the above you still think trying this out makes
sense, I will be happy to do that in the course of the next few days.
If building a special for me would be easy for you, I'd appreciate
that, but otherwise I can give it a try myself.
Certainly not JENA-689 or JENA-703, which are update-related and so not
obviously relevant here. JENA-801 is an issue about locking on the node
table, and that synchronization happens in the read-only situation as
well, which is why I thought it might be related.
If there are only a few connections (<=2) with large numbers of small
queries issued over each connection, and assuming there are no sorts and
no timeouts set, then the execution of each query should be entirely on
the thread that it came in on. And you have 8 (shame it's not 8*8!)
cores. Do you have a couple of example queries you can share?
Does the CPU load increase to start with, then drop off? Fuseki/TDB is
typically CPU-busy once the OS has warmed up and the working set of
index files is in memory.
Maybe the first thing to try is to point jvisualvm (in the JDK) or some
other monitoring tool at the Fuseki process and see if there is any
evidence. A thread dump would be useful. (jconsole even has a "Detect
Deadlock" button, which I have never used, but the label is suggestive.)
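
For reference, a deadlock check similar in spirit to that button can be
done from Java through the platform ThreadMXBean. The sketch below is only
an illustration under assumptions: the names (DeadlockCheck and its
methods) are hypothetical, it inspects the JVM it runs in rather than a
separate Fuseki process (for that, jstack <pid> or a JMX connection would
be needed), and it only finds lock cycles, so a hang with another cause
would not show up. It deliberately deadlocks two daemon threads so there
is something to detect.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class DeadlockCheck {
    /** Names of threads the JVM reports as deadlocked; empty if none. */
    static List<String> deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads();   // null when no cycle is found
        List<String> names = new ArrayList<>();
        if (ids == null) return names;
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) names.add(info.getThreadName());
        }
        return names;
    }

    /** Polls until a deadlock is reported or the timeout elapses. */
    static List<String> waitForDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> names = deadlockedThreadNames();
        while (names.isEmpty() && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            names = deadlockedThreadNames();
        }
        return names;
    }

    /** Starts two daemon threads that take the same two locks in opposite order. */
    static void startDeadlockedPair() {
        final Object a = new Object();
        final Object b = new Object();
        final CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread t1 = new Thread(() -> lockInOrder(a, b, bothHoldFirst), "demo-1");
        Thread t2 = new Thread(() -> lockInOrder(b, a, bothHoldFirst), "demo-2");
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await();           // wait until both threads hold their first lock
            } catch (InterruptedException e) {
                return;
            }
            synchronized (second) { }    // blocks forever: the other thread holds it
        }
    }

    public static void main(String[] args) {
        startDeadlockedPair();
        System.out.println("Deadlocked threads: " + waitForDeadlock(5000));
    }
}
```

If the check comes back empty while the server is still unresponsive, the
thread dump itself is the next thing to look at.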
Andy