Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-28 Thread NRK
On Sat, Jan 28, 2023 at 12:53:48PM +0100, Hiltjo Posthuma wrote:
> waitpid() and sigaction() can also fail with EINTR, which may mean our zombie
> handling fails, so block signals while setting things up to be careful.

IMO this line should be removed from the commit message because:

* It's not accurate (afaik) becasuse EINTR isn't specified for `sigaction`:
  https://www.man7.org/linux/man-pages/man3/sigaction.3p.html#ERRORS
* the signal blocking got removed from the final patch anyways.

But otherwise it's looking OK to me :)

- NRK



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-28 Thread Hiltjo Posthuma
On Sat, Jan 28, 2023 at 03:04:50PM +0600, NRK wrote:
> Hi Hiltjo,
> 
> On Sat, Jan 28, 2023 at 12:11:44AM +0100, Hiltjo Posthuma wrote:
> > We do not need to waitpid() on child processes. It is handled already
> > by using sigaction().
> 
> Here's a test case. I see `cat` turning into a zombie on my system
> without the `waitpid`:
> 
>   [/tmp]~> cat test.c 
>   #include 
>   #include 
>   #include 
>   #include 
>   #include 
>   
>   int main(void)
>   {
>   struct sigaction sa;
> 
>   puts("waiting for cat to become a zombie");
>   sleep(1);
> 
>   sigemptyset(_mask);
>   sa.sa_flags = SA_NOCLDSTOP | SA_NOCLDWAIT | SA_RESTART;
>   sa.sa_handler = SIG_IGN;
>   sigaction(SIGCHLD, , NULL);
> 
>   puts("hello there");
>   (void)getchar();
>   }
>   [/tmp]~> cat test.sh 
>   #!/bin/sh
>   
>   cat &
>   exec ./test
>   [/tmp]~> cc -o test test.c
>   [/tmp]~> ./test.sh 
>   waiting for cat to become a zombie
>   hello there
> 
> Just putting `while (waitpid(-1, NULL, WNOHANG) > 0);` after the
> `sigaction` call (without worrying about EINTR) should still be better
> than not calling `waitpid` at all IMO.
> 
> - NRK
> 

Hi,

OK thanks, below is the final patch I intend to commit:

>From ca522f8f0e4c9ccb0d5203536abb118db8b36d7f Mon Sep 17 00:00:00 2001
From: Chris Down 
Date: Sat, 28 Jan 2023 12:39:28 +0100
Subject: [PATCH] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

signal() semantics are pretty unclearly specified. For example, depending on OS
kernel and libc, the handler may be returned to SIG_DFL (hence the inner call
to read the signal handler). Moving to sigaction() means the behaviour is
consistently defined.

Using SA_NOCLDWAIT also allows us to avoid calling the non-reentrant function
die() in the handler.

waitpid() and sigaction() can also fail with EINTR, which may mean our zombie
handling fails, so block signals while setting things up to be careful.

Some addditional notes for archival purposes:

* NRK pointed out errno of waitpid could also theoretically get clobbered.
* The original patch was iterated on and modified by NRK and Hiltjo:
  * SIG_DFL was changed to SIG_IGN, this is required, atleast on older systems
such as tested on Slackware 11.
  * signals are not blocked using sigprocmask, because in theory it would
briefly for example also ignore a SIGTERM signal. It is OK if waitpid() is 
(in
theory interrupted).

POSIX reference:
"Consequences of Process Termination":
https://pubs.opengroup.org/onlinepubs/9699919799/functions/_Exit.html#tag_16_01_03_01
---
 dwm.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/dwm.c b/dwm.c
index 03baf42..88c3947 100644
--- a/dwm.c
+++ b/dwm.c
@@ -205,7 +205,6 @@ static void setmfact(const Arg *arg);
 static void setup(void);
 static void seturgent(Client *c, int urg);
 static void showhide(Client *c);
-static void sigchld(int unused);
 static void spawn(const Arg *arg);
 static void tag(const Arg *arg);
 static void tagmon(const Arg *arg);
@@ -1543,9 +1542,16 @@ setup(void)
int i;
XSetWindowAttributes wa;
Atom utf8string;
+   struct sigaction sa;
 
-   /* clean up any zombies immediately */
-   sigchld(0);
+   /* do not transform children into zombies when they terminate. */
+   sigemptyset(_mask);
+   sa.sa_flags = SA_NOCLDSTOP | SA_NOCLDWAIT | SA_RESTART;
+   sa.sa_handler = SIG_IGN;
+   sigaction(SIGCHLD, , NULL);
+
+   /* clean up any zombies (inherited from .xinitrc etc) immediately */
+   while (waitpid(-1, NULL, WNOHANG) > 0);
 
/* init screen */
screen = DefaultScreen(dpy);
@@ -1638,14 +1644,6 @@ showhide(Client *c)
}
 }
 
-void
-sigchld(int unused)
-{
-   if (signal(SIGCHLD, sigchld) == SIG_ERR)
-   die("can't install SIGCHLD handler:");
-   while (0 < waitpid(-1, NULL, WNOHANG));
-}
-
 void
 spawn(const Arg *arg)
 {
-- 
2.39.1

-- 
Kind regards,
Hiltjo



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-28 Thread NRK
Hi Hiltjo,

On Sat, Jan 28, 2023 at 12:11:44AM +0100, Hiltjo Posthuma wrote:
> We do not need to waitpid() on child processes. It is handled already
> by using sigaction().

Here's a test case. I see `cat` turning into a zombie on my system
without the `waitpid`:

[/tmp]~> cat test.c 
#include 
#include 
#include 
#include 
#include 

int main(void)
{
struct sigaction sa;

puts("waiting for cat to become a zombie");
sleep(1);

sigemptyset(_mask);
sa.sa_flags = SA_NOCLDSTOP | SA_NOCLDWAIT | SA_RESTART;
sa.sa_handler = SIG_IGN;
sigaction(SIGCHLD, , NULL);

puts("hello there");
(void)getchar();
}
[/tmp]~> cat test.sh 
#!/bin/sh

cat &
exec ./test
[/tmp]~> cc -o test test.c
[/tmp]~> ./test.sh 
waiting for cat to become a zombie
hello there

Just putting `while (waitpid(-1, NULL, WNOHANG) > 0);` after the
`sigaction` call (without worrying about EINTR) should still be better
than not calling `waitpid` at all IMO.

- NRK



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-27 Thread Hiltjo Posthuma
On Thu, Jan 26, 2023 at 07:44:08PM +0600, NRK wrote:
> On Wed, Jan 25, 2023 at 09:17:53PM +0100, Hiltjo Posthuma wrote:
> > Using the new patch it does not handle zombie/defunct processes anymore on 
> > this
> > machine. To reproduce it, in .xinitrc, on a dusty machine:
> > 
> > sleep 10 &
> > exec dwm
> 
> For ease of testing I wrote this dummy program. Both SIG_DFL and SIG_IGN
> works on my system (glibc 2.36 & linux 5.15.13).
> 
>   [/tmp]~> cat test.c 
>   #include 
>   #include 
>   #include 
>   #include 
>   
>   int main(void)
>   {
>   const struct sigaction sc = {
>   .sa_handler = SIG_DFL,
>   .sa_flags = SA_RESTART | SA_NOCLDWAIT | SA_NOCLDSTOP,
>   };
>   if (sigaction(SIGCHLD, , NULL) < 0)
>   abort();
>   while (waitpid(-1, NULL, WNOHANG) > 0);
>   puts("hello there");
>   (void)getchar();
>   }
>   [/tmp]~> cc -o test test.c
>   [/tmp]~> cat test.sh 
>   #!/bin/sh
>   
>   sleep 4 &
>   exec ./test
>   [/tmp]~> ./test.sh
> 
> And the `signal(3p)` version also works:
> 
>   int main(void)
>   {
>   if (signal(SIGCHLD, SIG_IGN) < 0)
>   abort();
>   while (waitpid(-1, NULL, WNOHANG) > 0);
>   puts("hello there");
>   (void)getchar();
>   }
> 
> > Also, maybe it is better to not call sigprocmask? Since now it blocks all
> > signals, including SIGTERM in setup().
> 
> @Chris: I'm looking at the POSIX manpage for sigaction and I don't think
> it can fail due to interrupt. Only `EINVAL` is specified:
> https://www.man7.org/linux/man-pages/man3/sigaction.3p.html#ERRORS
> 
> So getting rid of the `sigprocmask` and manually testing for EINTR when
> reaping inherited zombies should be fine, I think.
> 
> - NRK
> 

Hi,

I've looked at it a bit more and tested on more machines: OpenBSD, Slackware 11
and Void Linux. I think the simplified below patch should be fine.  We do not
need to waitpid() on child processes. It is handled already by using
sigaction().

We should not (briefly) block all signals (including SIGTERM!) on setup I think.

diff --git a/dwm.c b/dwm.c
index 03baf42..371bada 100644
--- a/dwm.c
+++ b/dwm.c
@@ -205,7 +205,6 @@ static void setmfact(const Arg *arg);
 static void setup(void);
 static void seturgent(Client *c, int urg);
 static void showhide(Client *c);
-static void sigchld(int unused);
 static void spawn(const Arg *arg);
 static void tag(const Arg *arg);
 static void tagmon(const Arg *arg);
@@ -1540,12 +1539,16 @@ setmfact(const Arg *arg)
 void
 setup(void)
 {
+   struct sigaction sa;
int i;
XSetWindowAttributes wa;
Atom utf8string;
 
-   /* clean up any zombies immediately */
-   sigchld(0);
+   /* do not transform children into zombies when they terminate. */
+   sigemptyset(_mask);
+   sa.sa_flags = SA_NOCLDSTOP | SA_NOCLDWAIT | SA_RESTART;
+   sa.sa_handler = SIG_IGN;
+   sigaction(SIGCHLD, , NULL);
 
/* init screen */
screen = DefaultScreen(dpy);
@@ -1638,14 +1641,6 @@ showhide(Client *c)
}
 }
 
-void
-sigchld(int unused)
-{
-   if (signal(SIGCHLD, sigchld) == SIG_ERR)
-   die("can't install SIGCHLD handler:");
-   while (0 < waitpid(-1, NULL, WNOHANG));
-}
-
 void
 spawn(const Arg *arg)
 {

-- 
Kind regards,
Hiltjo



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-26 Thread NRK
On Wed, Jan 25, 2023 at 09:17:53PM +0100, Hiltjo Posthuma wrote:
> Using the new patch it does not handle zombie/defunct processes anymore on 
> this
> machine. To reproduce it, in .xinitrc, on a dusty machine:
> 
>   sleep 10 &
>   exec dwm

For ease of testing I wrote this dummy program. Both SIG_DFL and SIG_IGN
works on my system (glibc 2.36 & linux 5.15.13).

[/tmp]~> cat test.c 
#include 
#include 
#include 
#include 

int main(void)
{
const struct sigaction sc = {
.sa_handler = SIG_DFL,
.sa_flags = SA_RESTART | SA_NOCLDWAIT | SA_NOCLDSTOP,
};
if (sigaction(SIGCHLD, , NULL) < 0)
abort();
while (waitpid(-1, NULL, WNOHANG) > 0);
puts("hello there");
(void)getchar();
}
[/tmp]~> cc -o test test.c
[/tmp]~> cat test.sh 
#!/bin/sh

sleep 4 &
exec ./test
[/tmp]~> ./test.sh

And the `signal(3p)` version also works:

int main(void)
{
if (signal(SIGCHLD, SIG_IGN) < 0)
abort();
while (waitpid(-1, NULL, WNOHANG) > 0);
puts("hello there");
(void)getchar();
}

> Also, maybe it is better to not call sigprocmask? Since now it blocks all
> signals, including SIGTERM in setup().

@Chris: I'm looking at the POSIX manpage for sigaction and I don't think
it can fail due to interrupt. Only `EINVAL` is specified:
https://www.man7.org/linux/man-pages/man3/sigaction.3p.html#ERRORS

So getting rid of the `sigprocmask` and manually testing for EINTR when
reaping inherited zombies should be fine, I think.

- NRK



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-25 Thread Hiltjo Posthuma
On Wed, Jan 25, 2023 at 04:24:36PM +0600, NRK wrote:
> On Wed, Jan 25, 2023 at 09:37:58AM +0100, Hiltjo Posthuma wrote:
> > I don't think there can be any zombies since it is run in setup() before any
> > processes can spawn (via keybindings etc) or am I overlooking something
> > obvious?
> 
> It was recently discussed here: 
> https://lists.suckless.org/hackers/2207/18424.html
> And here's the original thread from 2009: 
> https://lists.suckless.org/dev/0908/0774.html
> 
> - NRK
> 

Thanks, yeah I'm also digging in the history now.

I just tried it on a Slackware 11 machine, because I'm curious what it might 
break etc.

Using the new patch it does not handle zombie/defunct processes anymore on this
machine. To reproduce it, in .xinitrc, on a dusty machine:

sleep 10 &
exec dwm

Run ps in the terminal, after the sleep process exits it is defunct.  This used
to work without the patch.

I changed SIG_DFL to SIG_IGN and it seems to work.

Also, maybe it is better to not call sigprocmask? Since now it blocks all
signals, including SIGTERM in setup().

-- 
Kind regards,
Hiltjo



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-25 Thread Hiltjo Posthuma
On Wed, Jan 25, 2023 at 08:39:51AM +, Chris Down wrote:
> Hiltjo Posthuma writes:
> > I don't think there can be any zombies since it is run in setup() before any
> > processes can spawn (via keybindings etc) or am I overlooking something
> > obvious?
> 
> You can have zombies from what's spawned in .xinitrc or similar, because
> usually the final step is `exec dwm'. :-)
> 

Thanks, yeah I also realised it after a few moments I posted this heh,

Doh

-- 
Kind regards,
Hiltjo



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-25 Thread Chris Down

Hiltjo Posthuma writes:

I don't think there can be any zombies since it is run in setup() before any
processes can spawn (via keybindings etc) or am I overlooking something
obvious?


You can have zombies from what's spawned in .xinitrc or similar, because 
usually the final step is `exec dwm'. :-)




Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-25 Thread NRK
On Wed, Jan 25, 2023 at 09:37:58AM +0100, Hiltjo Posthuma wrote:
> I don't think there can be any zombies since it is run in setup() before any
> processes can spawn (via keybindings etc) or am I overlooking something
> obvious?

It was recently discussed here: 
https://lists.suckless.org/hackers/2207/18424.html
And here's the original thread from 2009: 
https://lists.suckless.org/dev/0908/0774.html

- NRK



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-25 Thread Hiltjo Posthuma
On Wed, Jan 25, 2023 at 12:06:25PM +0600, NRK wrote:
> On Tue, Jan 24, 2023 at 09:01:08PM +0100, Hiltjo Posthuma wrote:
> > Although of course checking errno on a success condition is a bit wonky in 
> > this test case.
> 
> It was just an illustration that the malloc succeeded :)
> A more real-world check would be something like this (in fact, I'm quite
> sure this is precisely the cause of the xlib XIO error):
> 
>   if (poll(...) < 0) {
>   if (errno == EINTR)
>   continue;
>   else
>   error(...);
>   }
> 
> > What about a simplified version of a patch below, it should do a similar 
> > thing:
> >  
> > /* clean up any zombies immediately */
> > -   sigchld(0);
> > +   if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
> > +   die("can't ignore SIGCHLD:");
> 
> One issue here is that this will not clean up any existing zombies
> (inherited from .xinitrc etc).
> 
> That's what that "clean up any zombies immediately" comment is referring
> to. I think that comment ought to be a bit more descriptive and mention
> where these zombies are coming from since it's not immediately obvious.
> 
> One other thing worth mentioning is that this behavior wasn't well
> defined on earlier POSIX versions so some historical BSDs may have
> different behavior. From 
> (https://man7.org/linux/man-pages/man2/sigaction.2.html#NOTES):
> 
> | POSIX.1-1990 disallowed setting the action for SIGCHLD to
> | SIG_IGN.  POSIX.1-2001 and later allow this possibility, so that
> | ignoring SIGCHLD can be used to prevent the creation of zombies.
> | Nevertheless, the historical BSD and System V behaviors for ignoring
> | SIGCHLD differ
> 
> The POSIX manpage for signal(3p) also says that "new applications should
> use sigaction() rather than signal()" - probably due to these
> incompatibility reasons.
> 
> But in any case, if you don't want to use `sigaction`, I think reaping
> the existing zombies after the `signal` call _should_ work:
> 
>   -   /* clean up any zombies immediately */
>   -   sigchld(0);
>   +   if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
>   +   die("can't ignore SIGCHLD:");
>   +   /* clean up any zombies (inherited from .xinitrc etc) 
> immediately */
>   +   while (waitpid(-1, NULL, WNOHANG) > 0);
> 
> I haven't tested (or even compiled) the above patch. And it seems that
> signals (much like threads) are really easy to mess up, so feel free to
> correct if anything is wrong.
> 
> - NRK
> 

I don't think there can be any zombies since it is run in setup() before any
processes can spawn (via keybindings etc) or am I overlooking something
obvious?

-- 
Kind regards,
Hiltjo



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-24 Thread NRK
On Tue, Jan 24, 2023 at 09:01:08PM +0100, Hiltjo Posthuma wrote:
> Although of course checking errno on a success condition is a bit wonky in 
> this test case.

It was just an illustration that the malloc succeeded :)
A more real-world check would be something like this (in fact, I'm quite
sure this is precisely the cause of the xlib XIO error):

if (poll(...) < 0) {
if (errno == EINTR)
continue;
else
error(...);
}

> What about a simplified version of a patch below, it should do a similar 
> thing:
>  
>   /* clean up any zombies immediately */
> - sigchld(0);
> + if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
> + die("can't ignore SIGCHLD:");

One issue here is that this will not clean up any existing zombies
(inherited from .xinitrc etc).

That's what that "clean up any zombies immediately" comment is referring
to. I think that comment ought to be a bit more descriptive and mention
where these zombies are coming from since it's not immediately obvious.

One other thing worth mentioning is that this behavior wasn't well
defined on earlier POSIX versions so some historical BSDs may have
different behavior. From 
(https://man7.org/linux/man-pages/man2/sigaction.2.html#NOTES):

| POSIX.1-1990 disallowed setting the action for SIGCHLD to
| SIG_IGN.  POSIX.1-2001 and later allow this possibility, so that
| ignoring SIGCHLD can be used to prevent the creation of zombies.
| Nevertheless, the historical BSD and System V behaviors for ignoring
| SIGCHLD differ

The POSIX manpage for signal(3p) also says that "new applications should
use sigaction() rather than signal()" - probably due to these
incompatibility reasons.

But in any case, if you don't want to use `sigaction`, I think reaping
the existing zombies after the `signal` call _should_ work:

-   /* clean up any zombies immediately */
-   sigchld(0);
+   if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
+   die("can't ignore SIGCHLD:");
+   /* clean up any zombies (inherited from .xinitrc etc) 
immediately */
+   while (waitpid(-1, NULL, WNOHANG) > 0);

I haven't tested (or even compiled) the above patch. And it seems that
signals (much like threads) are really easy to mess up, so feel free to
correct if anything is wrong.

- NRK



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-24 Thread Chris Down

Hiltjo Posthuma writes:

-   sigchld(0);
+   if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
+   die("can't ignore SIGCHLD:");


I didn't test it yet, so maybe I'm misremembering/misreading the patch, but I 
don't think this will work: SIG_IGN will ignore new children, but it won't 
clean up any which are already ready to reap. That's why you need careful 
ordering of SIG_IGN and waitpid().




Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-24 Thread Hiltjo Posthuma
On Tue, Jan 24, 2023 at 10:25:11AM +0600, NRK wrote:
> On Mon, Jan 23, 2023 at 08:05:19PM +0100, Hiltjo Posthuma wrote:
> > Do you perhaps also have some simple way to reproduce where it causes 
> > issues in
> > some real world use-case?
> > 
> > Ideally some command or script to reproduce the issue.
> 
> To trigger the issue, you need to have 3 conditions met:
> 
> 1. Recieve a SIGCHLD.
> 2. `waitpid` failing and spoiling the `errno`.
> 3. Being unlucky that the resuming code gets affected by the `errno`.
> 
> Here's a simple dummy program demonstrating the issue. Compile it, run
> it and then send some SIGCHLDs to it:
> 
>   [/tmp]~> cat sigtest.c
>   #include 
>   #include 
>   #include 
>   #include 
>   #include 
>   
>   void
>   sigchld(int unused)
>   {
>   if (signal(SIGCHLD, sigchld) == SIG_ERR)
>   _Exit(1);
>   while (0 < waitpid(-1, NULL, WNOHANG));
>   }
>   
>   int main(void)
>   {
>   sigchld(0);
>   while (1) {
>   errno = 0;
>   char *p = malloc(8);
>   if (p != NULL && errno) {
>   perror("sigtest: ");
>   return 1;
>   }
>   free(p);
>   }
>   }
>   [/tmp]~> cc -o sigtest sigtest.c
>   [/tmp]~> ./sigtest &
>   [1] 9363
>   [/tmp]~> while pidof sigtest >/dev/null; do kill -s SIGCHLD $(pidof 
> sigtest); done
>   sigtest: : No child processes
> 
> If you're asking for `dwm` in specific - then trigger condition 2 is
> difficult (impossible ??) since the window manager will typically have
> some children. But if you insert the following, to simulate failure:
> 
>   while (0 < waitpid(-1, NULL, WNOHANG));
>   +   errno = ECHILD;
> 
> then the issue is easily reproducable by sending some SIGCHLDs to dwm
> (in quick succession in order to maximize chances of triggering
> condition 3):
> 
>   var=$(pidof dwm); while :; do kill -s SIGCHLD $var; done
> 
> And dwm will crash with a XIO error with in a couple seconds on my
> system.
> 
> - NRK
> 

Thanks, I can see how it can be an issue clobbering errno potentially etc.
Although of course checking errno on a success condition is a bit wonky in this 
test case.

What about a simplified version of a patch below, it should do a similar thing:



>From d628a1081a2403e3b703b2f1acbff1f0ca148956 Mon Sep 17 00:00:00 2001
From: Hiltjo Posthuma 
Date: Tue, 24 Jan 2023 20:53:56 +0100
Subject: [PATCH] simplify ignoring SIGCHLD

... and handle in a way with less possible race-conditions.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/_Exit.html#tag_16_01_03_01

"[XSI] [Option Start] If the parent process of the calling process has set its
SA_NOCLDWAIT flag or has set the action for the SIGCHLD signal to SIG_IGN:

[...] The lifetime of the calling process shall end immediately. If
SA_NOCLDWAIT is set, it is implementation-defined whether a SIGCHLD signal is
sent to the parent process. [...]"
---
 dwm.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/dwm.c b/dwm.c
index 03baf42..b3c1fc6 100644
--- a/dwm.c
+++ b/dwm.c
@@ -205,7 +205,6 @@ static void setmfact(const Arg *arg);
 static void setup(void);
 static void seturgent(Client *c, int urg);
 static void showhide(Client *c);
-static void sigchld(int unused);
 static void spawn(const Arg *arg);
 static void tag(const Arg *arg);
 static void tagmon(const Arg *arg);
@@ -1545,7 +1544,8 @@ setup(void)
Atom utf8string;
 
/* clean up any zombies immediately */
-   sigchld(0);
+   if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
+   die("can't ignore SIGCHLD:");
 
/* init screen */
screen = DefaultScreen(dpy);
@@ -1638,14 +1638,6 @@ showhide(Client *c)
}
 }
 
-void
-sigchld(int unused)
-{
-   if (signal(SIGCHLD, sigchld) == SIG_ERR)
-   die("can't install SIGCHLD handler:");
-   while (0 < waitpid(-1, NULL, WNOHANG));
-}
-
 void
 spawn(const Arg *arg)
 {
-- 
2.39.1

-- 
Kind regards,
Hiltjo



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-23 Thread NRK
On Mon, Jan 23, 2023 at 08:05:19PM +0100, Hiltjo Posthuma wrote:
> Do you perhaps also have some simple way to reproduce where it causes issues 
> in
> some real world use-case?
> 
> Ideally some command or script to reproduce the issue.

To trigger the issue, you need to have 3 conditions met:

1. Recieve a SIGCHLD.
2. `waitpid` failing and spoiling the `errno`.
3. Being unlucky that the resuming code gets affected by the `errno`.

Here's a simple dummy program demonstrating the issue. Compile it, run
it and then send some SIGCHLDs to it:

[/tmp]~> cat sigtest.c
#include 
#include 
#include 
#include 
#include 

void
sigchld(int unused)
{
if (signal(SIGCHLD, sigchld) == SIG_ERR)
_Exit(1);
while (0 < waitpid(-1, NULL, WNOHANG));
}

int main(void)
{
sigchld(0);
while (1) {
errno = 0;
char *p = malloc(8);
if (p != NULL && errno) {
perror("sigtest: ");
return 1;
}
free(p);
}
}
[/tmp]~> cc -o sigtest sigtest.c
[/tmp]~> ./sigtest &
[1] 9363
[/tmp]~> while pidof sigtest >/dev/null; do kill -s SIGCHLD $(pidof 
sigtest); done
sigtest: : No child processes

If you're asking for `dwm` in specific - then trigger condition 2 is
difficult (impossible ??) since the window manager will typically have
some children. But if you insert the following, to simulate failure:

while (0 < waitpid(-1, NULL, WNOHANG));
+   errno = ECHILD;

then the issue is easily reproducable by sending some SIGCHLDs to dwm
(in quick succession in order to maximize chances of triggering
condition 3):

var=$(pidof dwm); while :; do kill -s SIGCHLD $var; done

And dwm will crash with a XIO error with in a couple seconds on my
system.

- NRK



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2023-01-23 Thread Hiltjo Posthuma
On Mon, Jan 23, 2023 at 07:34:04AM +, Chris Down wrote:
> Thanks NRK for bringing this one up again :-)
> 
> NRK writes:
> > If there's any specific issue with Chris' patch which is blocking the
> > merge then I'd be happy to address them.
> 
> Likewise -- happy to act on any feedback.
> 
> I've been running this for the last half a year or so and haven't noticed
> any problems, so I suspect it's fine to go in mainline, but happy to
> discuss/address any concerns.
> 
> Thanks,
> 
> Chris
> 

Hi Chris,

I haven't run this patch for half a year and haven't noticed any issues either 
;)

Do you perhaps also have some simple way to reproduce where it causes issues in
some real world use-case?

Ideally some command or script to reproduce the issue.

-- 
Kind regards,
Hiltjo



Re: [hackers] Re: [dwm][PATCH v2] Use sigaction(SA_NOCLDWAIT) for SIGCHLD handling

2022-08-09 Thread Chris Down

NRK writes:

ping!

In case this got lost in the noise :)


Thanks for the poke :-)

From my side, I've been running with this patch for a couple of weeks and 
everything works fine, so I think it's good to go.