Re: wait(2) and SIGCHLD
On Sat, Aug 15, 2020 at 07:57:26PM -0400, Terry Moore wrote:
 > >> I would say so, especially since that would mean the child's parent is
 > >> no longer the process that forked it (which could break other use
 > >> cases).
 > >
 > > That depends on how you implement detaching, but I suppose ultimately
 > > it's important for getppid() to revert to 1 at the point the parent
 > > exits (neither before, nor after, nor never) so some kind of linkage
 > > needs to remain.
 > >
 > > Bah.
 > >
 > > I guess it's time to invent yet another different interface to
 > > fork-and-really-detach.
 >
 > No time to experiment today, but from the descriptions it sounds as if a
 > double fork would work, with the child exiting immediately after forking
 > the grandchild? Kind of unpleasant, but nothing new needed?

(For the record: yes, forking twice works, that's more or less the
standard approach; but it's comparatively expensive.)

--
David A. Holland
dholl...@netbsd.org
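
For readers who have not seen the double fork spelled out, here is a minimal sketch of it. The names spawn_detached and notify_pipe are hypothetical and do not come from the thread; the sketch assumes a POSIX system and treats an ECHILD from waitpid() as "the system already discarded the intermediate child", which is what the thread describes when SIGCHLD is set to SIG_IGN.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Example payload (hypothetical): write one byte to the descriptor in arg. */
void notify_pipe(void *arg)
{
    int fd = *(int *)arg;

    if (write(fd, "x", 1) != 1)
        _exit(1);
}

/*
 * Hypothetical helper: run fn(arg) in a grandchild that the caller never
 * has to wait for. Returns 0 on success, -1 on failure.
 */
int spawn_detached(void (*fn)(void *), void *arg)
{
    int status;
    pid_t kid = fork();

    if (kid < 0)
        return -1;
    if (kid == 0) {
        /* Intermediate child: fork the grandchild, then exit at once. */
        pid_t grandkid = fork();

        if (grandkid < 0)
            _exit(1);
        if (grandkid == 0) {
            fn(arg);
            _exit(0);
        }
        _exit(0);
    }
    /* Reap the short-lived intermediate child so it leaves no zombie. */
    while (waitpid(kid, &status, 0) < 0) {
        if (errno == ECHILD)
            return 0;   /* SIGCHLD ignored: status was already discarded */
        if (errno != EINTR)
            return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The caller only ever waits for the intermediate child, which exits immediately, so the extra cost is one additional fork() per detached process. The grandchild is orphaned as soon as the intermediate child exits and is later reaped by whatever process the system reparents it to.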
Re: wait(2) and SIGCHLD
On Sat 15 Aug 2020 at 19:57:26 -0400, Terry Moore wrote:
> David Holland wrote:
> >> I would say so, especially since that would mean the child's parent is
> >> no longer the process that forked it (which could break other use
> >> cases).
> >
> > That depends on how you implement detaching, but I suppose ultimately
> > it's important for getppid() to revert to 1 at the point the parent
> > exits (neither before, nor after, nor never) so some kind of linkage
> > needs to remain.
> >
> > Bah.
> >
> > I guess it's time to invent yet another different interface to
> > fork-and-really-detach.
>
> No time to experiment today, but from the descriptions it sounds as if a
> double fork would work, with the child exiting immediately after forking
> the grandchild? Kind of unpleasant, but nothing new needed?

My first thought was that daemon(3) does something like that already
(the idea sounds familiar to me), but it does just a single fork(2) and
a setsid(2).

-Olaf.
--
Olaf 'Rhialto' Seibert -- rhialto at falu dot nl
___ Anyone who is capable of getting themselves made President should on
\X/ no account be allowed to do the job. --Douglas Adams, "THGTTG"
Re: wait(2) and SIGCHLD
>>> but isn't what's supposed to happen when a child's parent is
>>> ignoring SIGCHLD - the child should skip zombie state, and simply
>>> be cleaned up.
>> And how is "reparent to init" not an acceptable means of
>> implementing that?
> Acceptable or not, it would seem to not match our own documentation.

Point. That manpage wording should be updated a little.

>> I thought I'd seen some code that rendered init immune to SIGKILL
>> and possibly SIGSTOP too [...]
> SIGSTOP is one of two signals that a process supposedly should not be
> able to intercept. Of course, init is special enough that normal
> rules might not apply...

Yes, the code I was thinking of was inside the kernel, where of course
rules like that apply only insofar as the code chooses to let them.

>> Right, they shouldn't be. But init shouldn't be stopped, either.
>> Similarly, I think it should be impossible to ptrace init, [...]
> How special does one really want init to be?

As special as it needs to be.

I'm not as confident now as I was when I wrote that that ptracing init
should be impossible. I do think it should be possible to configure a
system such that it's impossible, and that that should be the default.
But, as someone who routinely goes under the hood, I think it could be
very useful to be able to set a system up so that it's possible.

As a data point: I booted a scratch system (4.0.1, because that's all I
have on the most convenient scratch hardware), and neither
"kill -STOP 1" nor "kill -KILL 1" had any effect visible to ps ax. I
don't know where/how they're getting stopped, but they are.

Mouse
Re: wait(2) and SIGCHLD
On 2020-08-16 21:17, Mouse wrote:
>>> They don't vanish, they get reparented to init(8) which then wakes
>>> up and reaps them.
>> That probably would work, approximately,
>
> Well, it does work, to at least a first approximation.
>
>> but isn't what's supposed to happen when a child's parent is ignoring
>> SIGCHLD - the child should skip zombie state, and simply be cleaned
>> up.
>
> And how is "reparent to init" not an acceptable means of implementing
> that?

Acceptable or not, it would seem to not match our own documentation.

From the sigaction() man-page:

     SA_NOCLDWAIT    If set, the system will not create a zombie when the
                     child exits, but the child process will be
                     automatically waited for. The same effect can be
                     achieved by setting the signal handler for SIGCHLD
                     to SIG_IGN.

>> The difference would be detectable if init were sent a SIGSTOP
>> (assuming that isn't one which would cause a system panic)
>
> I don't think it would panic, but I think that, if it really does stop
> init, it's a bug that it does so.
>
> I thought I'd seen some code that rendered init immune to SIGKILL and
> possibly SIGSTOP too (maybe by forcing them into init's blocked-signals
> set? I forget). But I can't seem to find it now.

SIGSTOP is one of two signals that a process supposedly should not be
able to intercept. Of course, init is special enough that normal rules
might not apply...

>> so it would stop reaping children (temporarily) - processes of the
>> type in question should not be showing up as zombies.
>
> Right, they shouldn't be. But init shouldn't be stopped, either.
>
> Similarly, I think it should be impossible to ptrace init, and I have a
> fuzzy memory that it was on at least one system I tried it on. I'll be
> poking around a bit more.

How special does one really want init to be?

  Johnny

--
Johnny Billquist || "I'm on a bus
 || on a psychedelic trip
email: b...@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
Re: wait(2) and SIGCHLD
>> They don't vanish, they get reparented to init(8) which then wakes
>> up and reaps them.
> That probably would work, approximately,

Well, it does work, to at least a first approximation.

> but isn't what's supposed to happen when a child's parent is ignoring
> SIGCHLD - the child should skip zombie state, and simply be cleaned
> up.

And how is "reparent to init" not an acceptable means of implementing
that?

> The difference would be detectable if init were sent a SIGSTOP
> (assuming that isn't one which would cause a system panic)

I don't think it would panic, but I think that, if it really does stop
init, it's a bug that it does so.

I thought I'd seen some code that rendered init immune to SIGKILL and
possibly SIGSTOP too (maybe by forcing them into init's blocked-signals
set? I forget). But I can't seem to find it now.

> so it would stop reaping children (temporarily) - processes of the
> type in question should not be showing up as zombies.

Right, they shouldn't be. But init shouldn't be stopped, either.

Similarly, I think it should be impossible to ptrace init, and I have a
fuzzy memory that it was on at least one system I tried it on. I'll be
poking around a bit more.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: wait(2) and SIGCHLD
In article <28808.1597602...@jinx.noi.kre.to>,
Robert Elz wrote:
>Date: Sun, 16 Aug 2020 16:13:57 - (UTC)
>From: chris...@astron.com (Christos Zoulas)
>Message-ID:
>
> | They don't vanish, they get reparented to init(8) which then wakes up
> | and reaps them.
>
>That probably would work, approximately, but isn't what's supposed to
>happen when a child's parent is ignoring SIGCHLD - the child should
>skip zombie state, and simply be cleaned up.
>
>The difference would be detectable if init were sent a SIGSTOP
>(assuming that isn't one which would cause a system panic)
>so it would stop reaping children (temporarily) - processes of
>the type in question should not be showing up as zombies.

FreeBSD does what we do (reparent to init). Linux has autoreap which
moves the state of the process to DEAD without going through ZOMBIE and
adds it to the dead queue.

christos
Re: wait(2) and SIGCHLD
    Date: Sun, 16 Aug 2020 16:13:57 - (UTC)
    From: chris...@astron.com (Christos Zoulas)
    Message-ID:

  | They don't vanish, they get reparented to init(8) which then wakes up
  | and reaps them.

That probably would work, approximately, but isn't what's supposed to
happen when a child's parent is ignoring SIGCHLD - the child should
skip zombie state, and simply be cleaned up.

The difference would be detectable if init were sent a SIGSTOP
(assuming that isn't one which would cause a system panic)
so it would stop reaping children (temporarily) - processes of
the type in question should not be showing up as zombies.

kre
Re: wait(2) and SIGCHLD
In article <5919.1597441...@jinx.noi.kre.to>,
Robert Elz wrote:
>Date: Fri, 14 Aug 2020 20:01:18 +0200
>From: Edgar Fuß
>Message-ID: <20200814180117.gq61...@trav.math.uni-bonn.de>
>
> | 3. I don't see where POSIX defines or allows this, but given 2., I'm surely
> |    missing something.
>
>It is specified to work this way in POSIX, though right now I don't
>have the time to go dig out exactly where.
>
>Setting SIGCHLD to SIG_IGN effectively means that you want to ignore
>your children - they then don't report any exit status to their parent,
>but simply vanish when they exit. Thus when the parent does a wait()
>it has no children, and gets ECHILD.

They don't vanish, they get reparented to init(8) which then wakes up
and reaps them.

>Leave (or set) SIGCHLD to SIG_DFL and you don't get signals, but child
>processes do report status to their parent. Catch SIGCHLD and you'll
>get signalled when a child exits (I'm not sure if NetBSD guarantees one
>signal delivery for each exited child or just a signal if there are
>some unspecified number of exited children).
>
>The actions on an ignored SIGCHLD is SysV inherited behaviour,
>Bell Labs (v7/32V) and CSRG BSD systems didn't act this way.

Yup, I added this:

1.199 (christos 30-Mar-05): #define P_CLDSIGIGN 0x0008 /* Process is ignoring SIGCHLD */

christos
RE: wait(2) and SIGCHLD
David Holland wrote:
>> I would say so, especially since that would mean the child's parent is
>> no longer the process that forked it (which could break other use
>> cases).
>
> That depends on how you implement detaching, but I suppose ultimately
> it's important for getppid() to revert to 1 at the point the parent
> exits (neither before, nor after, nor never) so some kind of linkage
> needs to remain.
>
> Bah.
>
> I guess it's time to invent yet another different interface to
> fork-and-really-detach.

No time to experiment today, but from the descriptions it sounds as if a
double fork would work, with the child exiting immediately after forking
the grandchild? Kind of unpleasant, but nothing new needed?

--Terry
Re: wait(2) and SIGCHLD
On Sat, Aug 15, 2020 at 07:24:01AM -0400, Mouse wrote:
 > >>> What I observe is that a process that explicitly ignores SIGCHLD
 > >>> (SIG_IGN), then forks a child which exits, when wait()ing for the
 > >>> child, gets ECHILD (i.e., wait returns -1 and errno is ECHILD).
 > >> And the ECHILD return is delayed until all children have terminated
 > > Huh, I hadn't realized (or expected) that. So I guess it's wrong to
 > > implement this by just detaching the child up front...?
 >
 > I would say so, especially since that would mean the child's parent is
 > no longer the process that forked it (which could break other use
 > cases).

That depends on how you implement detaching, but I suppose ultimately
it's important for getppid() to revert to 1 at the point the parent
exits (neither before, nor after, nor never) so some kind of linkage
needs to remain.

Bah.

I guess it's time to invent yet another different interface to
fork-and-really-detach.

 > > I'm guessing also then that it's the signal setting when the child
 > > exits that matters; I had always thought it was the signal setting
 > > when the child was forked.
 >
 > Oh, interesting point.
 >
 > Yes, in a test I just did [...]

Yup, me too.

--
David A. Holland
dholl...@netbsd.org
Re: wait(2) and SIGCHLD
>>> What I observe is that a process that explicitly ignores SIGCHLD
>>> (SIG_IGN), then forks a child which exits, when wait()ing for the
>>> child, gets ECHILD (i.e., wait returns -1 and errno is ECHILD).
>> And the ECHILD return is delayed until all children have terminated
> Huh, I hadn't realized (or expected) that. So I guess it's wrong to
> implement this by just detaching the child up front...?

I would say so, especially since that would mean the child's parent is
no longer the process that forked it (which could break other use
cases).

> I'm guessing also then that it's the signal setting when the child
> exits that matters; I had always thought it was the signal setting
> when the child was forked.

Oh, interesting point.

Yes, in a test I just did on NetBSD 5.2, it is the signal setting when
the child exits that matters - program below. (I tried it on 1.4T, but
that's old enough it doesn't have the magic SIG_IGN semantic for
SIGCHLD, not even when set before the child is forked.)

I don't know whether POSIX specifies that or not.

#include <stdio.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void);
int main(void)
{
 pid_t kid;
 pid_t dead;

 fflush(0);
 kid = fork();
 if (kid == 0)
  { /* child: still running when the parent switches to SIG_IGN */
    printf("child sleeping\n");
    sleep(5);
    printf("child exiting\n");
    exit(0);
  }
 printf("child %d forked\n",(int)kid);
 printf("parent sleeping\n");
 sleep(1);
 /* SIGCHLD is ignored only after the fork, while the child is alive */
 printf("parent ignoring SIGCHLD\n");
 signal(SIGCHLD,SIG_IGN);
 while (1)
  { printf("parent waiting\n");
    dead = wait(0);
    if (dead < 0)
     { if (errno == ECHILD)
        { /* no child status available: stop waiting */
          printf("wait error: %s\n",strerror(ECHILD));
          break;
        }
       else
        { printf("wait error: %s\n",strerror(errno));
        }
     }
    else
     { printf("child %d reaped\n",(int)dead);
     }
  }
 printf("parent exiting\n");
 return(0);
}

> "signals are a semantic cesspool"

That they are.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: wait(2) and SIGCHLD
On Fri, Aug 14, 2020 at 03:51:12PM -0400, Mouse wrote:
 > > What I observe is that a process that explicitly ignores SIGCHLD
 > > (SIG_IGN), then forks a child which exits, when wait()ing for the
 > > child, gets ECHILD (i.e., wait returns -1 and errno is ECHILD).
 >
 > And the ECHILD return is delayed until all children have terminated
 > (ie, until there are no outstanding children, until the ECHILD return
 > is accurate). That's important for some use cases.

Huh, I hadn't realized (or expected) that. So I guess it's wrong to
implement this by just detaching the child up front...?

I'm guessing also then that it's the signal setting when the child
exits that matters; I had always thought it was the signal setting
when the child was forked.

"signals are a semantic cesspool"

--
David A. Holland
dholl...@netbsd.org
Re: wait(2) and SIGCHLD
    Date: Fri, 14 Aug 2020 20:01:18 +0200
    From: Edgar Fuß
    Message-ID: <20200814180117.gq61...@trav.math.uni-bonn.de>

  | 3. I don't see where POSIX defines or allows this, but given 2., I'm surely
  |    missing something.

Actually, I did go take a look, it is in the XSH page for _Exit()
under "Consequences of Process Termination" (some other places
reference this section).

kre
Re: wait(2) and SIGCHLD
    Date: Fri, 14 Aug 2020 20:01:18 +0200
    From: Edgar Fuß
    Message-ID: <20200814180117.gq61...@trav.math.uni-bonn.de>

  | 3. I don't see where POSIX defines or allows this, but given 2., I'm surely
  |    missing something.

It is specified to work this way in POSIX, though right now I don't
have the time to go dig out exactly where.

Setting SIGCHLD to SIG_IGN effectively means that you want to ignore
your children - they then don't report any exit status to their parent,
but simply vanish when they exit. Thus when the parent does a wait()
it has no children, and gets ECHILD.

Leave (or set) SIGCHLD to SIG_DFL and you don't get signals, but child
processes do report status to their parent. Catch SIGCHLD and you'll
get signalled when a child exits (I'm not sure if NetBSD guarantees one
signal delivery for each exited child or just a signal if there are
some unspecified number of exited children).

The action on an ignored SIGCHLD is SysV-inherited behaviour;
Bell Labs (v7/32V) and CSRG BSD systems didn't act this way.

kre
Re: wait(2) and SIGCHLD
> What I observe is that a process that explicitly ignores SIGCHLD
> (SIG_IGN), then forks a child which exits, when wait()ing for the
> child, gets ECHILD (i.e., wait returns -1 and errno is ECHILD).

And the ECHILD return is delayed until all children have terminated
(ie, until there are no outstanding children, until the ECHILD return
is accurate). That's important for some use cases.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: wait(2) and SIGCHLD
	Hello. I think Mouse said it best. There is a difference between
SIG_DFL and SIG_IGN, which is how you can avoid being signaled when a
child exits while wait(2) will still wait for a child if you call it.

Hope that helps.
-Brian

On Aug 14, 10:10am, Brian Buhrow wrote:
} Subject: Re: wait(2) and SIGCHLD
} 	Hello. I'm not sure I've completely understood your question, but I
} think you're confusing the issue of whether a child posts a SIGCHLD signal
} when it exits versus whether the current process that's calling wait(2)
} receives a SIGCHLD when a child exits. The default behavior, as I
} understand it, is that if a process has children, by default, it will not
} get signaled if those children terminate. However, if that process then
} calls wait(2), it will hang until a child terminates, regardless of whether
} it's configured to receive the SIGCHLD or not. In that instance, I think
} the man page is wrong, at least if code I have running is to be believed. So,
} I think there's no difference between the default ignoring of the SIGCHLD
} signal and explicitly ignoring it.
} -Brian
}
} On Aug 14, 1:51pm, Edgar Fuß wrote:
} } Subject: wait(2) and SIGCHLD
} } I'm confused regarding the behaviour of wait(2) wrt. SIGCHLD handling.
} }
} } The wait(2) manpage says:
} }
} }      wait() will fail and return immediately if:
} }      [ECHILD]     The calling process has no existing unwaited-for child
} }                   processes; or no status from the terminated child
} }                   process is available because the calling process has
} }                   asked the system to discard such status by ignoring
} }                   the signal SIGCHLD or setting the flag SA_NOCLDWAIT
} }                   for that signal.
} }
} } However, ignore is the default handler for SIGCHLD.
} }
} } So does the
} }      because the calling process has asked the system
} }      to discard such status by ignoring the signal SIGCHLD
} } mean that explicitly ignoring SIGCHLD is different from ignoring it per default?
} >-- End of excerpt from Edgar Fuß
}
}
>-- End of excerpt from Brian Buhrow
Re: wait(2) and SIGCHLD
1. Sample program attached. Change SIG_IGN to SIG_DFL to see the difference.
2. macOS seems to behave the same way, as does Linux.
3. I don't see where POSIX defines or allows this, but given 2., I'm surely
   missing something.
4. The wording in wait(2) could be improved to clarify this is only about
   SIG_IGN, not SIG_DFL. At least, the NetBSD manpage mentions this at all.
5. Every time I think I know Unix, I learn otherwise.

#include <err.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int stat = 0;
int ret;

int main(int argc, char * argv[]) {
	/* explicitly ignore SIGCHLD; change to SIG_DFL to compare */
	signal(SIGCHLD, SIG_IGN);
	if (fork()) {
		/* parent: with SIG_IGN this fails with ECHILD */
		if ((ret = wait(&stat)) < 0) err(1, "wait");
		printf("ret %d, stat %d\n", ret, stat);
	} else {
		/* child: exit immediately with a recognisable status */
		exit(42);
	}
	return 0;
}
Re: wait(2) and SIGCHLD
> I'm not sure I've completely understood your question
Probably not. Or I don't get what you are trying to say.

What I observe is that a process that explicitly ignores SIGCHLD
(SIG_IGN), then forks a child which exits, when wait()ing for the
child, gets ECHILD (i.e., wait returns -1 and errno is ECHILD).
Re: wait(2) and SIGCHLD
	Hello. I'm not sure I've completely understood your question, but I
think you're confusing the issue of whether a child posts a SIGCHLD signal
when it exits versus whether the current process that's calling wait(2)
receives a SIGCHLD when a child exits. The default behavior, as I
understand it, is that if a process has children, by default, it will not
get signaled if those children terminate. However, if that process then
calls wait(2), it will hang until a child terminates, regardless of whether
it's configured to receive the SIGCHLD or not. In that instance, I think
the man page is wrong, at least if code I have running is to be believed. So,
I think there's no difference between the default ignoring of the SIGCHLD
signal and explicitly ignoring it.
-Brian

On Aug 14, 1:51pm, Edgar Fuß wrote:
} Subject: wait(2) and SIGCHLD
} I'm confused regarding the behaviour of wait(2) wrt. SIGCHLD handling.
}
} The wait(2) manpage says:
}
}      wait() will fail and return immediately if:
}      [ECHILD]     The calling process has no existing unwaited-for child
}                   processes; or no status from the terminated child
}                   process is available because the calling process has
}                   asked the system to discard such status by ignoring
}                   the signal SIGCHLD or setting the flag SA_NOCLDWAIT
}                   for that signal.
}
} However, ignore is the default handler for SIGCHLD.
}
} So does the
}      because the calling process has asked the system
}      to discard such status by ignoring the signal SIGCHLD
} mean that explicitly ignoring SIGCHLD is different from ignoring it per default?
>-- End of excerpt from Edgar Fuß
Re: wait(2) and SIGCHLD
The second question (that I forgot in the original mail) is whether
wait(2) returning ECHILD for whatever handling of SIGCHLD is covered by
POSIX.
Re: wait(2) and SIGCHLD
> So does the
>      because the calling process has asked the system
>      to discard such status by ignoring the signal SIGCHLD
> mean that explicitly ignoring SIGCHLD is different from ignoring it
> per default?

I wouldn't say it *means* that, exactly, but I do think that this is a
case where SIG_IGN is different from SIG_DFL even when the default
action is to do nothing.

To put it another way, there are two concepts here, both of which are
getting turned into tenses of "ignore" in English. One is a handler of
SIG_IGN; the other is SIG_DFL with a default action of "do nothing".
The wording in wait(2) is, I believe, talking about only the SIG_IGN
kind of ignoring.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
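
The two kinds of "ignore" can be told apart from user space by what wait() reports. A small sketch, assuming a system with the SIG_IGN behaviour discussed in this thread; the helper name wait_errno_with is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: set SIGCHLD to `disposition`, fork a child that
 * exits at once, then wait() for it. Returns 0 if the child was reaped
 * normally, otherwise the errno value from the failing call.
 */
int wait_errno_with(void (*disposition)(int))
{
    pid_t kid;

    if (signal(SIGCHLD, disposition) == SIG_ERR)
        return errno;
    kid = fork();
    if (kid < 0)
        return errno;
    if (kid == 0)
        _exit(0);               /* child: exit immediately */
    if (wait(NULL) < 0)
        return errno;           /* e.g. ECHILD: status was discarded */
    return 0;
}
```

Calling it with SIG_DFL should return 0, because the child's status is kept until it is collected; calling it with SIG_IGN should return ECHILD, which is the case the wait(2) wording is describing.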
Re: wait(2) and SIGCHLD
I agree that this is confusing. While the system definitely
differentiates between SIG_DFL and SIG_IGN, I would not normally expect
that distinction to matter for something described the way wait(2) is
documented.

I haven't really bothered going down into the code to find the answer,
but I'm curious what other answers pop up for this one.

  Johnny

On 2020-08-14 13:51, Edgar Fuß wrote:
> I'm confused regarding the behaviour of wait(2) wrt. SIGCHLD handling.
>
> The wait(2) manpage says:
>
>      wait() will fail and return immediately if:
>      [ECHILD]     The calling process has no existing unwaited-for child
>                   processes; or no status from the terminated child
>                   process is available because the calling process has
>                   asked the system to discard such status by ignoring
>                   the signal SIGCHLD or setting the flag SA_NOCLDWAIT
>                   for that signal.
>
> However, ignore is the default handler for SIGCHLD.
>
> So does the
>      because the calling process has asked the system
>      to discard such status by ignoring the signal SIGCHLD
> mean that explicitly ignoring SIGCHLD is different from ignoring it per default?

--
Johnny Billquist || "I'm on a bus
 || on a psychedelic trip
email: b...@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
wait(2) and SIGCHLD
I'm confused regarding the behaviour of wait(2) wrt. SIGCHLD handling.

The wait(2) manpage says:

     wait() will fail and return immediately if:
     [ECHILD]     The calling process has no existing unwaited-for child
                  processes; or no status from the terminated child
                  process is available because the calling process has
                  asked the system to discard such status by ignoring
                  the signal SIGCHLD or setting the flag SA_NOCLDWAIT
                  for that signal.

However, ignore is the default handler for SIGCHLD.

So does the
     because the calling process has asked the system
     to discard such status by ignoring the signal SIGCHLD
mean that explicitly ignoring SIGCHLD is different from ignoring it per default?