Dan

Is there any chance you could try grabbing a JVM thread dump next time you 
notice this happening?

There are multiple ways to do this depending on your environment; the simplest 
and most portable is to send a SIGQUIT to the JVM process, which causes a 
thread dump to be written to the JVM's own standard output (i.e. it will appear 
wherever Fuseki's stdout is going, not in the shell you run kill from) e.g.

> kill -QUIT <pid>

This then might give us some idea of what the JVM is doing at the time, which, 
combined with more detail from the logging, might prove more enlightening.

Rob

On 15/06/2018, 03:48, "Dan Pritts" <[email protected]> wrote:

    
    > So the issue is that memory goes up, that is the heap expands to the 
    > maximum Xmx size set?  The JVM does not return any heap back to the OS 
    > (as far as I know) so if all the applications grow their heaps, the 
    > real RAM use grows to match that, or swapping may result.
    Hi Andy,
    
    thanks for taking the time to help.
    
    The problem is that the NON-HEAP memory usage skyrockets.
    
    I "allocate" memory for the heap.    The gc logs suggested that I was 
    never exceeding 6GB of heap in use, even when things went to hell.  So I 
    set the heap to 10GB.
    
    Now that I know we're using NIO, I "allocate" memory for NIO to hold the 
    entire index in RAM.  The db is 2.4GB on disk.  I don't know NIO well, 
    but this seems plausible.
    
    Let's throw another gig at java for its own internal use.
    
    That would add up to 10 + 2.4 + 1 = 13.4GB of memory I might expect java 
    to use.  There's nothing else on the server except apache, linux, and a 
    few system daemons (postfix, etc.).
    
    I upgraded to 3.7 and put fuseki on its own AWS instance last night. RAM 
    was 16GB and swap 10GB.
    
    Once today it filled ram & swap such that linux whacked the jvm 
    process.  Two other times today it was swapping heavily (5GB of swap 
    used) and we restarted fuseki before the system ran out of swap.
    
    For some reason, the JVM running fuseki+jetty is going nuts with its 
    memory usage.  It *is* using more heap than usual when this happens, but 
    it's not using more than the 10GB I allocated.   At least, not according 
    to the garbage collection logs.
    
    We have had this problem a few times in the past - memory usage would 
    spike drastically.  We'd always attributed it to a slow memory leak, and 
    decided we should restart fuseki regularly.  But in the last couple 
    weeks it's happened probably a dozen times.
    
    After the third time today, I put it on a 32GB instance.  Of course, the 
    problem hasn't happened since.
    
    > A couple of possibilities:
    >
    > 1/ A query does an ORDER BY that involves a large set of results to 
    > sort. This then drives up the heap requirement, the JVM grows the heap 
    > and now the process is larger.  There may well be a CPU spike at this 
    > time.
    >
    > 2/ Updates are building up. The journal isn't flushed to the main 
    > database until there is a quiet moment and with the high query rate 
    > you may get bursts of time when it is not quiet.  The updates are safe 
    > in the journal (the commit happened) but also in-memory as an overlay 
    > on the database.  The overlays are collapsed when there are no readers 
    > or writers.
    >
    > What might be happening is that there isn't a quiet moment.
    The traffic is certainly steady - it was about 1500 hits/minute today 
    when we first crashed.
    > Big sudden jump would imply a big update as well.
    
    > Setting the log into INFO (and, yes, at load it does get big)
    >
    > What you are looking for is overlaps of query/updates so that the log 
    > shows true concurrent execution (i.e [1] starts, [2] starts, [1] 
    > finishes logged after [2] starts) around the time the size grows 
    > quickly and check the size of updates.
    I will look for this.  I am dubious, though.  We don't make many writes, 
    and those we do are not very big.  Our dataset is metadata about our 
    archive.  The archive is 50 years old, and grows steadily but slowly.
    
    We had disabled the fuseki log but left httpd logging enabled, because 
    each was huge.  Unfortunately the updates were all in POSTs, which I 
    hadn't noticed until I went looking just now.  So I will have to wait 
    until next time.
    
    thanks
    danno
    
    
    



