Hi!

Please forgive me for being a lazy slacker, too lazy even to send a
proper git patch, but anyway here's a small simplification diff for the
SIGTSTP handler of GHC's RTS (explanations below):

--- rts/posix/Signals.c.orig    Mon Jun 13 19:10:06 2011
+++ rts/posix/Signals.c Tue Dec 27 19:58:52 2011
@@ -494,7 +494,7 @@ empty_handler (int sig STG_UNUSED)
 
    The trick we use is:
      - catch SIGTSTP
-     - in the handler,  kill(getpid(),SIGTSTP)
+     - in the handler,  kill(getpid(),SIGSTOP)
      - when this returns, restore the TTY settings
    This means we don't have to catch SIGCONT too.
 
@@ -516,17 +516,8 @@ sigtstp_handler (int sig)
         }
     }
 
-    // de-install the SIGTSTP handler
-    set_sigtstp_action(rtsFalse);
-
     // really stop the process now
-    {
-        sigset_t mask;
-        sigemptyset(&mask);
-        sigaddset(&mask, sig);
-        sigprocmask(SIG_UNBLOCK, &mask, NULL);
-        kill(getpid(), sig);
-    }
+    kill(getpid(), SIGSTOP);
 
     // on return, restore the TTY state
     for (fd = 0; fd <= 2; fd++) {
@@ -534,8 +525,6 @@ sigtstp_handler (int sig)
             tcsetattr(0,TCSANOW,&ts[fd]);
         }
     }
-
-    set_sigtstp_action(rtsTrue);
 }
 
 static void


The reason why I need this diff (for ghc-7.0.4, but it should equally
apply on the ghc-7.4 branch) is a little bit embarrassing: the current
pthreads implementation on OpenBSD is -- err -- a little bit limited.
In particular, switching the handler for a signal to SIG_DFL from
within a signal handler triggered by that very same signal doesn't
have any effect.

This means that if you try to suspend *any* Haskell program on OpenBSD
that is linked with -lpthread (or, more correctly, with -pthread),
the program doesn't stop but starts sending itself an endless chain
of SIGTSTP signals. Ouch!

Here's a ktrace/kdump output of a program in that poor state:

  3088 ghc      PSIG  SIGTSTP caught handler=0x20221d940 mask=0x4002000
  3088 ghc      RET   sigreturn JUSTRETURN
  3088 ghc      CALL  sigaction(SIGTSTP,0x7f7ffffe55f0,0)
  3088 ghc      RET   sigaction 0
  3088 ghc      CALL  getpid()
  3088 ghc      RET   getpid 3088/0xc10
  3088 ghc      CALL  kill(0xc10,SIGTSTP)
  3088 ghc      RET   kill 0
  3088 ghc      CALL  sigaction(SIGTSTP,0x7f7ffffe55f0,0)
  3088 ghc      RET   sigaction 0
  3088 ghc      CALL  sigreturn(0x7f7ffffe57e0)
  3088 ghc      PSIG  SIGTSTP caught handler=0x20221d940 mask=0x4002000
  3088 ghc      RET   sigreturn JUSTRETURN
  3088 ghc      CALL  sigaction(SIGTSTP,0x7f7ffffe55f0,0)
  3088 ghc      RET   sigaction 0
  3088 ghc      CALL  getpid()
  3088 ghc      RET   getpid 3088/0xc10
  3088 ghc      CALL  kill(0xc10,SIGTSTP)
  3088 ghc      RET   kill 0
  3088 ghc      CALL  sigaction(SIGTSTP,0x7f7ffffe55f0,0)
  3088 ghc      RET   sigaction 0
  3088 ghc      CALL  sigreturn(0x7f7ffffe57e0)
  3088 ghc      PSIG  SIGTSTP caught handler=0x20221d940 mask=0x4002000
[repeated ad infinitum]

While this is caused by an OpenBSD bug, it may be appropriate to
apply the patch for all operating systems, because it makes the
handler a little bit smaller and simpler (and less recursive-looking,
because the SIGTSTP handler now just raises a SIGSTOP).

It also avoids calling sigprocmask(2), whose behaviour is unspecified
in a multi-threaded program -- at least according to POSIX (you'd have
to use pthread_sigmask(3) instead).

What do you think? Are there any potential problems with the above
diff?

In any case, this shouldn't be applied without massive testing on
at least the tier-1 platforms (or without a comment from Simon Marlow,
who knows this area much better than I do).

Ciao,
        Kili

_______________________________________________
Cvs-ghc mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-ghc
