Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
--------------------------------+------------------------------
 Reporter:   sanketr            | Owner:
 Type:       bug                | Status:       closed
 Priority:   highest            | Milestone:    7.4.3
 Component:  Runtime System     | Version:      7.4.1
 Resolution: fixed              | Keywords:     worker, ffi
 Os:         Unknown/Multiple   | Architecture: x86_64 (amd64)
 Failure:    None/Unknown       | Difficulty:   Unknown
 Testcase:                      | Blockedby:
 Blocking:                      | Related:      #4262
--------------------------------+------------------------------
Changes (by pcapriotti):

 * status:  merge => closed
 * resolution:  => fixed

Comment:

 Merged as d677a952d666e5e7144e60524efb6989dddeb383 (plus testsuite fix
 e8fae135f7b7820f7dc213182903e8e4fb5170b6).

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:17
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
___
Glasgow-haskell-bugs mailing list
Glasgow-haskell-bugs@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
--------------------------------+------------------------------
 Reporter:   sanketr            | Owner:
 Type:       bug                | Status:       new
 Priority:   highest            | Milestone:    7.4.2
 Component:  Runtime System     | Version:      7.4.1
 Resolution:                    | Keywords:     worker, ffi
 Os:         Unknown/Multiple   | Architecture: x86_64 (amd64)
 Failure:    None/Unknown       | Difficulty:   Unknown
 Testcase:                      | Blockedby:
 Blocking:                      | Related:      #4262
--------------------------------+------------------------------
Changes (by sanketr):

 * status:  closed => new
 * resolution:  fixed =>

Comment:

 Simon, it looks like this fix didn't make it into 7.4.2. The number of
 task workers keeps increasing, and the memory leak is still there.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:14
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
--------------------------------+------------------------------
 Reporter:   sanketr            | Owner:
 Type:       bug                | Status:       merge
 Priority:   highest            | Milestone:    7.4.2
 Component:  Runtime System     | Version:      7.4.1
 Resolution:                    | Keywords:     worker, ffi
 Os:         Unknown/Multiple   | Architecture: x86_64 (amd64)
 Failure:    None/Unknown       | Difficulty:   Unknown
 Testcase:                      | Blockedby:
 Blocking:                      | Related:      #4262
--------------------------------+------------------------------
Changes (by simonmar):

 * status:  new => merge

Comment:

 Sorry, it looks like that was an oversight. We'll merge it into the
 branch for 7.4.3 (if there is one).

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:15
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
--------------------------------+------------------------------
 Reporter:   sanketr            | Owner:
 Type:       bug                | Status:       merge
 Priority:   highest            | Milestone:    7.4.3
 Component:  Runtime System     | Version:      7.4.1
 Resolution:                    | Keywords:     worker, ffi
 Os:         Unknown/Multiple   | Architecture: x86_64 (amd64)
 Failure:    None/Unknown       | Difficulty:   Unknown
 Testcase:                      | Blockedby:
 Blocking:                      | Related:      #4262
--------------------------------+------------------------------
Changes (by simonmar):

 * milestone:  7.4.2 => 7.4.3

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:16
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
--------------------------------+------------------------------
 Reporter:   sanketr            | Owner:
 Type:       bug                | Status:       new
 Priority:   highest            | Milestone:    7.4.2
 Component:  Runtime System     | Version:      7.4.1
 Resolution:                    | Keywords:     worker, ffi
 Os:         Unknown/Multiple   | Architecture: x86_64 (amd64)
 Failure:    None/Unknown       | Difficulty:   Unknown
 Testcase:                      | Blockedby:
 Blocking:                      | Related:      #4262
--------------------------------+------------------------------

Comment(by marlowsd@…):

 commit 085c7fe5d4ea6e7b59f944d46ecfeba3755a315b
 {{{
 Author: Simon Marlow marlo...@gmail.com
 Date:   Fri Mar 2 10:53:34 2012 +

     Drop the per-task timing stats, give a summary only (#5897)

     We were keeping around the Task struct (216 bytes) for every worker we
     ever created, even though we only keep a maximum of 6 workers per
     Capability. These Task structs accumulate and cause a space leak in
     programs that do lots of safe FFI calls; this patch frees the Task
     struct as soon as a worker exits.

     One reason we were keeping the Task structs around is because we print
     out per-Task timing stats in +RTS -s, but that isn't terribly useful.
     What is sometimes useful is knowing how *many* Tasks there were. So now
     I'm printing a single-line summary, this is for the program in

       TASKS: 2001 (1 bound, 31 peak workers (2000 total), using -N1)

     So although we created 2k tasks overall, there were only 31 workers
     active at any one time (which is exactly what we expect: the program
     makes 30 safe FFI calls concurrently).

     This also gives an indication of how many capabilities were being
     used, which is handy if you use +RTS -N without an explicit number.

  rts/RtsMain.c         |    3 +-
  rts/Stats.c           |   42 +---
  rts/Task.c            |  100 ++---
  rts/Task.h            |   30 +++---
  rts/posix/OSThreads.c |    1 -
  rts/sm/Compact.c      |    2 +-
  6 files changed, 73 insertions(+), 105 deletions(-)
 }}}

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:11
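
The workload the commit message describes (batches of 30 concurrent safe FFI calls) can be approximated with a small program. This is only a sketch, not the reporter's attached test case: it assumes a POSIX system so that `usleep` from `unistd.h` is available, and the names `c_usleep` and `runBatch` are invented for illustration.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, replicateM_)
import Foreign.C.Types (CInt (..), CUInt (..))

-- A blocking C call imported as "safe". Assumption: POSIX usleep exists.
-- With the threaded RTS, a safe call releases its Capability while the
-- C code runs, so other Haskell threads need worker tasks to continue.
foreign import ccall safe "unistd.h usleep"
  c_usleep :: CUInt -> IO CInt

-- Hypothetical helper: start n Haskell threads, each making one safe
-- FFI call, wait until all of them have returned, and report how many ran.
runBatch :: Int -> IO Int
runBatch n = do
  dones <- forM [1 .. n] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (c_usleep 1000 >> putMVar done ())
    return done
  mapM_ takeMVar dones
  return (length dones)

main :: IO ()
main = do
  -- 20 batches of 30 concurrent safe calls each.
  replicateM_ 20 (runBatch 30)
  putStrLn "done"
```

Compiled with `-threaded -rtsopts` and run with `+RTS -s`, a GHC that includes this commit should print the one-line `TASKS:` summary rather than one line per task; the actual counts depend on the machine and on scheduling.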
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
--------------------------------+------------------------------
 Reporter:   sanketr            | Owner:
 Type:       bug                | Status:       closed
 Priority:   highest            | Milestone:    7.4.2
 Component:  Runtime System     | Version:      7.4.1
 Resolution: fixed              | Keywords:     worker, ffi
 Os:         Unknown/Multiple   | Architecture: x86_64 (amd64)
 Failure:    None/Unknown       | Difficulty:   Unknown
 Testcase:                      | Blockedby:
 Blocking:                      | Related:      #4262
--------------------------------+------------------------------
Changes (by simonmar):

 * status:  new => closed
 * resolution:  => fixed

Comment:

 I looked into your example again. Although the RTS was not keeping all the
 OS threads around, it ''was'' retaining information about each OS thread
 that had been used as a worker in the past, and this was being printed in
 the output of `+RTS -s`. Since keeping around all this information could
 constitute a space leak, I've dropped it (see the commit above). The
 example program above used to leak space over time, now it runs in
 constant space.

 The per-Task stats weren't particularly useful anyway, even for me.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:12
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
--------------------------------+------------------------------
 Reporter:   sanketr            | Owner:
 Type:       bug                | Status:       closed
 Priority:   highest            | Milestone:    7.4.2
 Component:  Runtime System     | Version:      7.4.1
 Resolution: fixed              | Keywords:     worker, ffi
 Os:         Unknown/Multiple   | Architecture: x86_64 (amd64)
 Failure:    None/Unknown       | Difficulty:   Unknown
 Testcase:                      | Blockedby:
 Blocking:                      | Related:      #4262
--------------------------------+------------------------------

Comment(by sanketr):

 Replying to [comment:12 simonmar]:
 > I looked into your example again. Although the RTS was not keeping all
 > the OS threads around, it ''was'' retaining information about each OS
 > thread that had been used as a worker in the past, and this was being
 > printed in the output of `+RTS -s`. Since keeping around all this
 > information could constitute a space leak, I've dropped it (see the
 > commit above). The example program above used to leak space over time,
 > now it runs in constant space.
 >
 > The per-Task stats weren't particularly useful anyway, even for me.

 Terrific! The per-task stats now look much more useful after this fix.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:13
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
--------------------------------+------------------------------
 Reporter:   sanketr            | Owner:
 Type:       bug                | Status:       new
 Priority:   highest            | Milestone:    7.4.2
 Component:  Runtime System     | Version:      7.4.1
 Resolution:                    | Keywords:     worker, ffi
 Os:         Unknown/Multiple   | Architecture: x86_64 (amd64)
 Failure:    None/Unknown       | Difficulty:   Unknown
 Testcase:                      | Blockedby:
 Blocking:                      | Related:      #4262
--------------------------------+------------------------------
Changes (by sanketr):

 * owner:  simonmar =>
 * status:  closed => new
 * resolution:  invalid =>

Comment:

 Simon, thanks for the catch. I forgot that I was getting a pointer from a
 vector without retaining a reference to the vector (by passing it to, say,
 a long-lived timer function).

 How do I determine that the increasing number of threads is not an issue,
 and that they will be gc-ed eventually (if the RTS output is just showing
 worker threads that are not alive anymore)? When I run the above fixed
 code on Linux for a few seconds, I get RTS output like below:

 {{{
 ... head snipped for bug report here because too long .
 Task 93100 (worker) :0.00s( 0.00s) 0.00s( 0.00s)
 Task 93101 (worker) :0.00s( 0.00s) 0.00s( 0.00s)
 Task 93102 (worker) :0.00s( 0.08s) 0.00s( 0.00s)
 Task 93103 (worker) :0.00s( 0.00s) 0.00s( 0.00s)
 Task 93104 (worker) :0.00s( 0.14s) 0.00s( 0.00s)
 Task 93105 (worker) :0.00s( 0.12s) 0.00s( 0.00s)
 Task 93106 (worker) :0.00s( 0.03s) 0.00s( 0.00s)
 Task 93107 (worker) :0.00s( 0.20s) 0.00s( 0.00s)
 Task 93108 (worker) :0.03s( 5.10s) 0.00s( 0.00s)
 Task 93109 (bound) :0.00s( 0.00s) 0.03s( 0.03s)
 }}}

 It keeps increasing without any bounds for the short time I run it. If you
 look at the code, sendSignal is kicked off in FFI, but then it returns,
 and timerevent resumes execution. So, those worker threads should be freed
 (eventually). What I want to make sure is that the above observation can
 be safely ignored, and I would like to know how to determine that.

 I am re-opening the bug for task worker resolution. Please close it after
 commenting on my observation - I hope there is no bug here.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:8
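
The pointer mistake mentioned at the top of this comment (using a pointer into a buffer that nothing keeps alive) is usually avoided by doing the work inside a bracketing function that holds the reference. The ticket does not show the code in question, so the following is only a sketch using `ForeignPtr` from `base` as a stand-in for the vector; `fillAndSum` is a made-up name. If the vector is a `Data.Vector.Storable` one, `unsafeWith` plays the same role as `withForeignPtr` here.

```haskell
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrArray, withForeignPtr)
import Foreign.Marshal.Array (peekArray, pokeArray)

-- Hypothetical example: allocate a buffer of n CInts, fill it with 1..n,
-- and sum it back out through the raw pointer.
-- withForeignPtr guarantees the buffer stays alive for the whole action,
-- so the raw pointer p is never used after the memory has been collected.
fillAndSum :: Int -> IO CInt
fillAndSum n = do
  fp <- mallocForeignPtrArray n :: IO (ForeignPtr CInt)
  withForeignPtr fp $ \p -> do
    pokeArray p (map fromIntegral [1 .. n])
    xs <- peekArray n p
    return (sum xs)

main :: IO ()
main = fillAndSum 10 >>= print
```

The point of the design is that the raw pointer never escapes the bracket; a pointer handed to a long-lived C callback would instead need the owning value kept reachable for as long as C may use it.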
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
--------------------------------+------------------------------
 Reporter:   sanketr            | Owner:
 Type:       bug                | Status:       new
 Priority:   highest            | Milestone:    7.4.2
 Component:  Runtime System     | Version:      7.4.1
 Resolution:                    | Keywords:     worker, ffi
 Os:         Unknown/Multiple   | Architecture: x86_64 (amd64)
 Failure:    None/Unknown       | Difficulty:   Unknown
 Testcase:                      | Blockedby:
 Blocking:                      | Related:      #4262
--------------------------------+------------------------------

Comment(by sanketr):

 In case it is not clear right away from the code, here is the sequence of
 execution from timerevent when forking FFI threads:

 - timerevent forks sendSignal threads - it forks n threads corresponding
 to n C FFI threads - timerevent now waits on a list of mvar m2 - list has
 n elements, one corresponding to each sendSignal thread - each sendSignal
 thread calls back syncWithC function - puts an element in m2, and waits on
 m1 - timerevent gets all m2, and puts 0 in list of m1 - Each syncWithC
 thread (which was called back by sendSignal) gets its corresponding m1,
 and finishes. sendSignal is done. Those FFI threads are now finished.

 So, if sendSignal is finished (otherwise the code will deadlock if mvars
 are not put/taken in right order), the number of worker threads shouldn't
 keep increasing with each iteration of timerevent.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:9
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
--------------------------------+------------------------------
 Reporter:   sanketr            | Owner:
 Type:       bug                | Status:       new
 Priority:   highest            | Milestone:    7.4.2
 Component:  Runtime System     | Version:      7.4.1
 Resolution:                    | Keywords:     worker, ffi
 Os:         Unknown/Multiple   | Architecture: x86_64 (amd64)
 Failure:    None/Unknown       | Difficulty:   Unknown
 Testcase:                      | Blockedby:
 Blocking:                      | Related:      #4262
--------------------------------+------------------------------

Comment(by sanketr):

 Sorry, comment system mangled the steps. Here it is again:

 In case it is not clear right away from the code, here is the sequence of
 execution from timerevent when forking FFI threads:

 {{{
 - timerevent forks sendSignal threads
   - it forks n threads corresponding to n C FFI threads
 - timerevent now waits on a list of mvar m2
   - list has n elements, one corresponding to each sendSignal thread
 - each sendSignal thread calls back syncWithC function
   - puts an element in m2, and waits on m1
 - timerevent gets all m2, and puts 0 in list of m1
 - Each syncWithC thread (which was called back by sendSignal) gets its
   corresponding m1, and finishes. sendSignal is done. Those FFI threads
   are now finished.
 }}}

 So, if sendSignal is finished (otherwise the code will deadlock if mvars
 are not put/taken in right order), the number of worker threads shouldn't
 keep increasing with each iteration of timerevent.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:10
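
The handshake in the steps above can be sketched in plain Haskell. This is an illustration only, not the attached test code: the C side is replaced by ordinary `forkIO` threads, and `syncWithC` and `timerIteration` are stand-in definitions written for this sketch (the real `syncWithC` is called back from C through the FFI).

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, forM_, replicateM)

-- Stand-in for the callback each sendSignal thread would invoke.
syncWithC :: MVar Int -> MVar Int -> Int -> IO ()
syncWithC m1 m2 i = do
  putMVar m2 i       -- put an element in m2 ...
  _ <- takeMVar m1   -- ... and wait on m1 until timerevent releases us
  return ()

-- One iteration of the timer event for n workers. Returns the values
-- received on the m2 list, in list order.
timerIteration :: Int -> IO [Int]
timerIteration n = do
  m1s <- replicateM n newEmptyMVar
  m2s <- replicateM n newEmptyMVar
  -- Fork one thread per worker; `done` lets us observe that it finished.
  dones <- forM (zip3 [1 ..] m1s m2s) $ \(i, m1, m2) -> do
    done <- newEmptyMVar
    _ <- forkIO (syncWithC m1 m2 i >> putMVar done ())
    return done
  got <- mapM takeMVar m2s          -- timerevent gets all m2
  forM_ m1s (\m1 -> putMVar m1 0)   -- and puts 0 in every m1
  mapM_ takeMVar dones              -- every worker has now finished
  return got

main :: IO ()
main = timerIteration 5 >>= print
```

Because every worker has signalled `done` before `timerIteration` returns, no thread from one iteration is still running when the next begins, which is the property the comment argues for.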
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
------------------------------+--------------------------------
 Reporter:  sanketr           | Owner:
 Type:      bug               | Status:       new
 Priority:  normal            | Component:    Runtime System
 Version:   7.4.1             | Keywords:     worker, ffi
 Os:        Unknown/Multiple  | Architecture: x86_64 (amd64)
 Failure:   None/Unknown      | Testcase:
 Blockedby:                   | Blocking:
 Related:   #4262             |
------------------------------+--------------------------------
Changes (by sanketr):

 * related:  4262 => #4262

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:2
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
------------------------------+--------------------------------
 Reporter:  sanketr           | Owner:
 Type:      bug               | Status:       new
 Priority:  normal            | Component:    Runtime System
 Version:   7.4.1             | Keywords:     worker, ffi
 Os:        Unknown/Multiple  | Architecture: x86_64 (amd64)
 Failure:   None/Unknown      | Testcase:
 Blockedby:                   | Blocking:
 Related:   #4262             |
------------------------------+--------------------------------

Comment(by sanketr):

 In case this helps locate the cause of the issue further, this is what I
 got once on Mac when causing a seg fault (I commented out the mutex
 locking/unlocking in the C sendSignal function, so it was just the mvar
 callback at that point when this error happened):

   internal error: ARR_WORDS object entered!
   (GHC version 7.4.1 for x86_64_apple_darwin)
   Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:3
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
------------------------------+--------------------------------
 Reporter:  sanketr           | Owner:
 Type:      bug               | Status:       new
 Priority:  normal            | Component:    Runtime System
 Version:   7.4.1             | Keywords:     worker, ffi
 Os:        Unknown/Multiple  | Architecture: x86_64 (amd64)
 Failure:   None/Unknown      | Testcase:
 Blockedby:                   | Blocking:
 Related:   #4262             |
------------------------------+--------------------------------

Comment(by sanketr):

 Edit: it wasn't a seg fault as I noted above. I was trying to cause a seg
 fault, but got the above error instead, once. I can't reproduce it again.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:4
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
---------------------------------+-----------------------------
 Reporter:     sanketr           | Owner:     simonmar
 Type:         bug               | Status:    new
 Priority:     highest           | Milestone: 7.4.2
 Component:    Runtime System    | Version:   7.4.1
 Keywords:     worker, ffi       | Os:        Unknown/Multiple
 Architecture: x86_64 (amd64)    | Failure:   None/Unknown
 Difficulty:   Unknown           | Testcase:
 Blockedby:                      | Blocking:
 Related:      #4262             |
---------------------------------+-----------------------------
Changes (by simonmar):

 * owner:  => simonmar
 * difficulty:  => Unknown
 * priority:  normal => highest
 * milestone:  => 7.4.2

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:5
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
---------------------------------+-----------------------------
 Reporter:     sanketr           | Owner:     simonmar
 Type:         bug               | Status:    new
 Priority:     highest           | Milestone: 7.4.2
 Component:    Runtime System    | Version:   7.4.1
 Keywords:     worker, ffi       | Os:        Unknown/Multiple
 Architecture: x86_64 (amd64)    | Failure:   None/Unknown
 Difficulty:   Unknown           | Testcase:
 Blockedby:                      | Blocking:
 Related:      #4262             |
---------------------------------+-----------------------------

Comment(by sanketr):

 I am attaching simplified test code, the bare minimum that reproduces the
 bug of unreleased worker threads, and segmentation fault (on mac 10.7.2
 x86_64). The c FFI code is now much simpler.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:6
[GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
------------------------------+--------------------------------
 Reporter:  sanketr           | Owner:
 Type:      bug               | Status:       new
 Priority:  normal            | Component:    Runtime System
 Version:   7.4.1             | Keywords:     worker, ffi
 Os:        Unknown/Multiple  | Architecture: x86_64 (amd64)
 Failure:   None/Unknown      | Testcase:
 Blockedby:                   | Blocking:
 Related:   4262              |
------------------------------+--------------------------------
 I have test code which calls C FFI to collect data every n microseconds.
 The timer event in the Haskell code spawns one thread for each C FFI
 thread. Those C FFI threads call back, and coordinate with the calling GHC
 thread through an mvar.

 What I am consistently seeing is an increase in the number of runtime task
 workers with each iteration of the timer event. The attached test code
 reproduces the issue (please see the attached README on how to run it). I
 tested this on both Mac and Redhat Linux x86_64 with GHC 7.4.1, and was
 able to reliably reproduce the issue. The end result is that if the number
 of C FFI threads is beyond a certain threshold (6 on my quad-core iMac),
 the number of runtime tasks seems to increase without bounds.

 For example, here is a sample RTS output from the attached test code, with
 7 C FFI threads and 2 GHC threads, after two iterations of the call to C
 FFI - instead of 4 task workers, there are 10 - tested with -N3 +RTS -s on
 GHC 7.4.1 and Mac 10.7.2 (quad-core iMac):

 {{{
 Parallel GC work balance: nan (0 / 0, ideal 1)

                     MUT time (elapsed)  GC time (elapsed)
 Task 0 (worker) :0.00s( 0.42s) 0.00s( 0.00s)
 Task 1 (worker) :0.00s( 0.00s) 0.00s( 0.00s)
 Task 2 (worker) :0.00s( 0.93s) 0.00s( 0.00s)
 Task 3 (worker) :0.00s( 0.93s) 0.00s( 0.00s)
 Task 4 (worker) :0.00s( 0.93s) 0.00s( 0.00s)
 Task 5 (worker) :0.00s( 0.93s) 0.00s( 0.00s)
 Task 6 (worker) :0.00s( 0.93s) 0.00s( 0.00s)
 Task 7 (worker) :0.06s( 1.00s) 0.00s( 0.00s)
 Task 8 (worker) :0.00s( 0.00s) 0.00s( 0.00s)
 Task 9 (worker) :0.06s( 1.43s) 0.00s( 0.00s)
 Task 10 (bound) :0.00s( 0.00s) 0.00s( 0.00s)
 }}}

 The culprit seems to be the mvar callback by C FFI. If I remove the mvar
 callback, the number of task workers stays constant at 4. If this is a bug
 in the GHC runtime and not my code, it seems to be a big bug because the
 mvar callback is important for coordination with C FFI threads. This bug
 might have been in previous versions of GHC as well, but probably not
 discovered because it seems to require a certain C FFI thread count
 threshold to kick in.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897
Re: [GHC] #5897: GHC runtime task workers are not released with C FFI
#5897: GHC runtime task workers are not released with C FFI
------------------------------+--------------------------------
 Reporter:  sanketr           | Owner:
 Type:      bug               | Status:       new
 Priority:  normal            | Component:    Runtime System
 Version:   7.4.1             | Keywords:     worker, ffi
 Os:        Unknown/Multiple  | Architecture: x86_64 (amd64)
 Failure:   None/Unknown      | Testcase:
 Blockedby:                   | Blocking:
 Related:   4262              |
------------------------------+--------------------------------

Comment(by sanketr):

 Just figured out how to reproduce the segmentation fault/bus error I have
 been seeing with more complicated code this test code is derived from.

 The test code will consistently produce bus error or segmentation fault on
 Mac OS 10.7.2, after twenty or so iterations, if I set nThreads in main of
 T.hs to say 30. It seems to crash in pthread, during call to lock mutex in
 sendSignal C code - I wonder if it is related to some kind of mutex leak
 in interaction between GHC RTS and C FFI. gdb output from core below:

 {{{
 $ gdb ./T /cores/core.2372
 GNU gdb 6.3.50-20050815 (Apple version gdb-1705) (Fri Jul 1 10:50:06 UTC 2011)
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type show copying to see the conditions.
 There is absolutely no warranty for GDB. Type show warranty for details.
 This GDB was configured as x86_64-apple-darwin...Reading symbols for shared libraries ... done
 Reading symbols for shared libraries . done
 Reading symbols for shared libraries ... done
 #0 0x7fff9321fbca in __psynch_cvwait ()
 (gdb)
 }}}

 I can reproduce the crash consistently on quad-core iMac with -N3 runtime
 option.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5897#comment:1