On 22/12/14 02:03, Petr Baudis wrote:
   Hi!

On Sun, Dec 21, 2014 at 08:30:02PM +0000, Andy Seaborne wrote:
On 21/12/14 16:54, Petr Baudis wrote:
It works beautifully, but the system puts Fuseki under a pretty heavy
load, with several tens of SPARQL queries per second at times, often in
parallel.  And after about an hour on average, Fuseki just hangs up,
still accepting new queries but never generating a result.
..snip..

What is the setup in terms of hardware (RAM size, number of CPUs etc
etc), operating system and versions?  The details do matter here.

   This is:

        24GiB RAM
        8x AMD FX(tm)-8350 Eight-Core Processor
        Linux 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt2-1 (2014-12-08) x86_64 GNU/Linux
        (Debian Wheezy, with some leaf Jessie packages mixed in)
        Fuseki 1.1.1 (binary distribution)
        Jena 2.12.1 (used for tdbloader)

The SPARQL endpoint is publicly available at

        http://pasky.or.cz:3030/dbpedia/query

(right now I just run fuseki in a loop and kill it every 10 minutes
as a stopgap measure).
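
A gentler stopgap than a fixed 10-minute restart might be to probe the
endpoint and restart only when a trivial query fails to come back in time.
The following is a minimal sketch, not anything shipped with Fuseki: the
class name FusekiProbe, the localhost default and the 5 second timeout are
assumptions for illustration, and the query is sent as a plain SPARQL
protocol GET with a "query" parameter.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;

// Hypothetical helper, not part of Fuseki or Jena.
class FusekiProbe {
    /** Builds a SPARQL protocol GET URL: endpoint?query=<url-encoded query>. */
    public static String probeUrl(String endpoint, String sparql) {
        try {
            return endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if a trivial ASK query is answered with HTTP 200 in time. */
    public static boolean isResponsive(String endpoint, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection)
                    URI.create(probeUrl(endpoint, "ASK {}")).toURL().openConnection();
            conn.setConnectTimeout(timeoutMillis);
            // A server that accepts the request but never replies trips this timeout.
            conn.setReadTimeout(timeoutMillis);
            try {
                return conn.getResponseCode() == 200;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            // Connection refused, timeout, etc. all count as "not responsive".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed default; pass the real endpoint URL as the first argument.
        String endpoint = args.length > 0 ? args[0] : "http://localhost:3030/dbpedia/query";
        System.exit(isResponsive(endpoint, 5000) ? 0 : 1);
    }
}
```

The exit status (0 for responsive, 1 otherwise) is meant to be checked from
whatever loop restarts the server.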

This may be JENA-801 [1].

   Hmm. I see, that's interesting.  Just to clarify, though - what I'm
seeing is a hard hang, the Fuseki process is not consuming any CPU and
no queries are ever answered (at least on the order of hours).  It is
not simply a performance degradation, which I get the impression is
what JENA-801 is about.

   (Also, while I'm hitting Fuseki with a lot of queries, I believe there
should never be more than two connections + queries going on at the same
time.  The queries are pretty simple, typically take 2-3ms to service.)

   (Also, I really do just queries, no updates, I'm running in read-only
mode.)

   (Also, no messages like Java GC notifications are printed on Fuseki's
console in the event of this deadlock.)

   So this really seems quite different to what I'm reading there and in
JENA-689, JENA-703.

That's why I'd like to see what has been done to know if the update
changes were also tried out.  If for your usage, it is just query
load, I can build a special for you to try out (or you can: replace
the body of CacheLRU with CacheGuava body, add dependency to ARQ and
build with maven).

   If in the light of the above you still think trying this out makes
sense, I will be happy to do that in the course of the next few days.  If
building a special for me would be easy for you, I'd appreciate that,
but otherwise I can give it a try.


Certainly not JENA-689 or JENA-703, which are update-related and so not obviously relevant here. JENA-801 is an issue about locking on the node table, and that synchronization happens in the read-only situation as well, which is why I thought it might be related.

If there are only a few connections (<=2) with large numbers of small queries issued over each, then, assuming there are no sorts and no timeouts set, the execution of each query should happen entirely on the thread it came in on. And you have 8 (shame it's not 8*8!) cores. Do you have a couple of example queries you can share?

Does the CPU load increase to start with, then drop off? Fuseki/TDB is typically CPU-busy once the OS has warmed up and the working set of index files is in memory.

Maybe the first thing to try is to point jvisualvm (in the JDK) or some other monitoring tool at the Fuseki process and see if there is any evidence. A thread dump would be useful. (jconsole even has a "Detect Deadlock" button, which I have never used, but the label is suggestive.)
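
For reference, the JDK also exposes a deadlock check programmatically through the standard java.lang.management.ThreadMXBean. The sketch below is only an illustration of that API: the class name DeadlockCheck is made up, and it inspects the JVM it runs in, so to examine Fuseki it would have to run inside the Fuseki process or be adapted to a remote JMX connection (not shown here).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Fuseki or Jena.
class DeadlockCheck {
    /** Describes deadlocked threads in the current JVM; empty string if none. */
    public static String describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Covers both object monitors and java.util.concurrent locks;
        // returns null when no deadlock is found.
        long[] ids = threads.findDeadlockedThreads();
        if (ids == null) {
            return "";
        }
        StringBuilder report = new StringBuilder();
        // Request locked monitors and synchronizers along with the stack traces.
        for (ThreadInfo info : threads.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have exited in the meantime
                report.append(info);
            }
        }
        return report.toString();
    }

    public static void main(String[] args) {
        String report = describeDeadlocks();
        System.out.println(report.isEmpty() ? "No deadlock detected" : report);
    }
}
```

Note that an empty report only rules out a lock cycle; threads that are all parked waiting on something else would not show up here, which is why the full thread dump is still the more useful artifact.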

        Andy
