#4322: High CPU usage during idle time due to GC
-------------------------------+--------------------------------------------
Reporter: hesselink | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 6.12.3
Keywords: idle GC | Testcase:
Blockedby: | Difficulty:
Os: MacOS X | Blocking:
Architecture: x86 | Failure: Runtime performance bug
-------------------------------+--------------------------------------------
Comment(by hesselink):
I compiled using -debug, and ran using +RTS -N -Ds. Here is the output I
get:
{{{
b0103000: cap 0: starting GC
all threads:
threads on capability 0:
threads on capability 1:
thread 32 @ 0x8c2d000 is not blocked (TSO_DIRTY)
other threads:
thread 29 @ 0x8c6246c is blocked on an MVar @ 0x7c9c744
thread 2 @ 0x8c2b000 is blocked on an external call
thread 3 @ 0x8da3000 is blocked on an MVar @ 0x7ca5840
thread 10 @ 0x7cb0af0 is blocked on an MVar @ 0x8c2f368
thread 11 @ 0x8c1a250 is blocked on an STM operation
thread 13 @ 0x8c63000 is blocked on an STM operation
b0185000: cap 1: starting GC
a039a720: woken up on capability 1
a039a720: capability 1 is owned by another task
b0185000: cap 1: finished GC
b0185000: cap 1: running thread 32 (ThreadRunGHC)
b0185000: cap 1: thread 32 stopped (yielding)
b0185000: cap 1: thread 32 appended to run queue
b0185000: cap 1: running thread 32 (ThreadRunGHC)
b0185000: cap 1: thread 32 stopped (heap overflow)
b0185000: cap 1: requesting parallel GC
b0185000: ready_to_gc, grabbing GC threads
b0103000: cap 0: finished GC
}}}
This repeats continuously. Sometimes, after `ready_to_gc`, there are 10-20
lines saying `b0185000: passing capability 0 to worker 0xb0103000`. I've
attached a full log to this ticket which shows a slightly different run.
There is a lot of noise during startup. The server is running at line
25194. At 26162, I do a simple request; this triggers the CPU usage.
Before that, CPU usage is 0%, after that, 9%. At 28362, I shut down the
server again.
As for the three points: we have finalizers that free/destroy C/C++
structures, nothing else. We don't catch deadlock exceptions anywhere. I'm
not sure about the timers; I don't see anything in the -Ds logs, but I
don't really know how to read them.
One thing I forgot to mention earlier: the cache we use is some Data.Map's
and Data.Set's in TVar's. When we 'deepseq' these before storing them, the
CPU usage drops to 2-10%, which is still high, but lower than before. I'm
guessing this is just the GC having less indirections to traverse, but
perhaps it is useful info.
Does this shed any light on anything?
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/4322#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs