Hello,

Some time ago I received a mail from Sandro Tosi, with comments from Raphaël Hertzog, about some unending darcs processes running at 100% CPU. Since then I have been working on fixing bug http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=522617 on Alioth. The problem is that the bug shows up with darcs push, which is a very common command in the Debian Haskell Team workflow. For more information about the bug, see http://bugs.darcs.net/issue1278 .
It's not a bug in darcs but in GHC, and it was fixed in version 6.10.2. It only affects 64-bit architectures.

At first I talked to b...@#[email protected], who suggested backporting the newer GHC and then darcs, so that it could be installed on Alioth. This would require backporting a lot of Haskell libraries that are only present in sid, so formo...@#[email protected] suggested that I fix it in lenny instead. I talked about that with mornf...@#[email protected], and he pointed out that the bug is probably related to these two patches in the GHC darcs repository:

Thu Nov 13 14:00:05 BRST 2008  Simon Marlow <[email protected]>
  * Fix another subtle shutdown deadlock
  The problem occurred when a thread tries to GC during shutdown. In
  order to GC it has to acquire all the Capabilities in the system, but
  during shutdown, some of the Capabilities have already been closed and
  can never be acquired.

Thu Nov 13 13:57:30 BRST 2008  Simon Marlow <[email protected]>
  * Fix an extremely subtle deadlock bug on x86_64
  The recent_activity flag was an unsigned int, but we sometimes do a
  64-bit xchg() on it, which overwrites the next word in memory. This
  happened to contain the sched_state flag, which is used to control the
  orderly shutdown of the system. If the xchg() happened during
  shutdown, the scheduler would get confused and deadlock. Don't you
  just love C?

I applied the second one to ghc6_6.8.2dfsg1-1 and the bug is not present anymore. As the patch is small and this is a very annoying bug, I thought that the suggestion from mornfall was a good option. The patch is:

{
hunk ./rts/Schedule.c 95
+ *
+ * NB. must be StgWord, we do xchg() on it.
hunk ./rts/Schedule.c 98
-nat recent_activity = ACTIVITY_YES;
+volatile StgWord recent_activity = ACTIVITY_YES;
hunk ./rts/Schedule.c 101
- * LOCK: none (changes once, from false->true)
+ * LOCK: none (changes monotonically)
hunk ./rts/Schedule.c 103
-rtsBool sched_state = SCHED_RUNNING;
+volatile StgWord sched_state = SCHED_RUNNING;
hunk ./rts/Schedule.h 100
-extern rtsBool RTS_VAR(sched_state);
+extern volatile StgWord RTS_VAR(sched_state);
hunk ./rts/Schedule.h 116
-extern nat recent_activity;
+extern volatile StgWord recent_activity;
}

Kaol, do you think it's a good idea to incorporate this into lenny's GHC? This would not require rebuilding all the libraries. Or is it a better option to use the sid GHC to rebuild a new darcs binary and install it on Alioth? Or to upload the whole Haskell stack to backports?

Please give me hints.

Greetings.

--
marcot
http://marcot.iaaeee.org/
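P.S. In case it helps to see the failure mode described in the second patch in isolation, here is a minimal sketch in plain C. It is not RTS code: the struct layouts, the DEMO_* values and the function names are assumptions made up for illustration, and memcpy stands in for the store half of the 64-bit xchg() so that the demo stays well defined. It also assumes the two 32-bit flags sit next to each other in memory, which is what the patch description says happened.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in values; the real RTS constants may differ. */
enum { DEMO_ACTIVITY_YES = 0, DEMO_SCHED_SHUTTING_DOWN = 2 };

/* Modelled layout before the patch: two 32-bit flags side by side. */
struct narrow_flags {
    uint32_t recent_activity;
    uint32_t sched_state;
};

/* Modelled layout after the patch: each flag fills a 64-bit word. */
struct wide_flags {
    uint64_t recent_activity;
    uint64_t sched_state;
};

/* Writes 8 bytes starting at addr, like the store done by a 64-bit xchg(). */
static void store_word64(void *addr, uint64_t value)
{
    memcpy(addr, &value, sizeof value);
}

/* Narrow layout: the 8-byte store aimed at recent_activity spills over
 * into sched_state. Returns sched_state as seen after the store. */
uint32_t narrow_sched_state_after_store(void)
{
    struct narrow_flags f = { 1, DEMO_SCHED_SHUTTING_DOWN };
    /* The demo relies on there being no padding between the two fields. */
    assert(sizeof f == sizeof(uint64_t));
    /* &f is also the address of f.recent_activity, the first member. */
    store_word64(&f, DEMO_ACTIVITY_YES);
    return f.sched_state;
}

/* Wide layout: the same store fits inside recent_activity, so
 * sched_state is left alone. Returns sched_state after the store. */
uint64_t wide_sched_state_after_store(void)
{
    struct wide_flags f = { 1, DEMO_SCHED_SHUTTING_DOWN };
    store_word64(&f.recent_activity, DEMO_ACTIVITY_YES);
    return f.sched_state;
}
```

With the narrow layout, the store meant for recent_activity also zeroes sched_state, so a shutdown in progress would look like a running scheduler again. With word-sized fields sched_state keeps its value, which is the effect the StgWord change is after.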
