Re: FYI: problem with wait(1)
Thorsten Glaser t...@mirbsd.de wrote:
|Steffen Daode Nurpmeso dixit:
| |A brute force thing could be to patch the shell to run j_sigchld()
| |every time the job control builtins and kill are run (this will
| |still miss some, like async notify when /bin/kill is used instead).
|
|I think this is what the other shells (bash(1) and ksh(1) that is)
|necessarily do, but i'd agree it's a flaky approach of doing
|things and as such not worth the effort. (I.e. far less than
|
|Hm. I slept over it, and doing a waitpid run in two more places,
|– right before the jobs and maybe wait builtins, and
|– just after a command was run, before calling edit.c,
|sounds doable.
|
|Ideas? Or is this crazy?

I think i've misread what you said and also misunderstood; i haven't looked at the other shells, but i think what they (necessarily) do, due to the lack of a true WIFCONTINUED(), is track a «logical» state change, as in (1) you STOP it, it's stopped, (2) you CONTinue it, it's running again. In doing so they nicely work around a non-functioning WIFCONTINUED(), but of course miss updates that happen via any kill that is not builtin to the managing shell.

I indeed thought that mksh(1) doesn't do the manual tracking because it doesn't work if bypassed like that, and i do think that is the better thing to do, because it is *true*. Maybe it could be documented, however, so that a «normal» user gets it right (it's no worse than e.g. bash(1), only true). I'm afraid you're not willing to accept a documentation patch from my side…

|bye,
|//mirabilos

--steffen
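To make the «logical» tracking idea concrete, here is a minimal sketch in C. It is not taken from bash(1), ksh(1) or mksh(1); `struct job`, `job_kill()` and `job_spawn_sleeper()` are hypothetical names invented for illustration, and only SIGSTOP and SIGCONT are handled (SIGTSTP can be caught or ignored by the child, so guessing from it would be even less reliable).

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical job record; a real shell keeps far more state. */
enum jstate { J_RUNNING, J_STOPPED };

struct job {
	pid_t pid;
	enum jstate state;
};

/* Hypothetical helper: start a child that just sleeps in pause(). */
static struct job
job_spawn_sleeper(void)
{
	struct job j;

	j.state = J_RUNNING;
	j.pid = fork();
	if (j.pid == 0) {
		for (;;)
			pause();
	}
	return j;
}

/*
 * Sketch of a builtin kill: deliver the signal, then update the
 * recorded state from the signal number alone, without asking the
 * kernel what actually happened to the child.
 */
static int
job_kill(struct job *j, int sig)
{
	assert(j != NULL);
	if (kill(j->pid, sig) != 0)
		return -1;
	if (sig == SIGSTOP)
		j->state = J_STOPPED;
	else if (sig == SIGCONT)
		j->state = J_RUNNING;
	return 0;
}
```

Anything that signals the child without going through `job_kill()`, such as /bin/kill, leaves `state` stale, which is the weakness described above.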
Re: FYI: problem with wait(1)
Steffen Daode Nurpmeso dixit:

|I think i've misread what you said and also misunderstand; […]

I do not understand your mail at all… sorry, but I think we have a language or conceptual barrier here?

bye,
//mirabilos
-- 
20:49⎜«Natureshadow» Oops, I just went and poured mineral water into my
     ⎜ear while drinking…
21:04⎜«mirabilos» can that go in a sig? █ PS: سمَـَّوُوُحخ ̷̴خ ̷̴خ ̷̴خ امارتيخ ̷̴خ
21:05⎜«Natureshadow» mirabilos: what should stop you…
Re: FYI: problem with wait(1)
Todd C. Miller dixit:

|I don't think there is a way for a process to be notified via SIGCHLD
|when a child process receives SIGCONT. So unless you are already in
|waitpid() with WCONTINUED set you won't see it.

OK, thanks for confirming though.

bye,
//mirabilos
-- 
I believe no one can invent an algorithm. One just happens to hit upon it
when God enlightens him. Or only God invents algorithms, we merely copy them.
If you don't believe in God, just consider God as Nature if you won't deny
existence.		-- Coywolf Qi Hunt
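If no notification can be relied upon, the shell has to ask explicitly. The following is a minimal sketch in C of a non-blocking check that could run, for instance, right before a jobs builtin prints its table. It is not mksh code: `struct poll_counts`, `poll_children()`, `poll_children_retry()` and `spawn_sleeper()` are hypothetical names, and whether such a late call still reports a continue event is platform dependent (elsewhere in this thread it is reported not to work on OpenBSD 5.4 and Mac OS X Snow Leopard).

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>	/* kill(), for callers */
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical tally of the state changes collected so far. */
struct poll_counts {
	int stopped;
	int continued;
	int ended;
};

/*
 * Collect every pending child state change without blocking.
 * Returns the number of events seen by this call.
 */
static int
poll_children(struct poll_counts *pc)
{
	int status, seen = 0;
	int flags = WNOHANG | WUNTRACED;
	pid_t pid;

	assert(pc != NULL);
#ifdef WCONTINUED
	flags |= WCONTINUED;
#endif
	while ((pid = waitpid(-1, &status, flags)) > 0) {
		seen++;
#ifdef WCONTINUED
		if (WIFCONTINUED(status)) {
			pc->continued++;
			continue;
		}
#endif
		if (WIFSTOPPED(status))
			pc->stopped++;
		else
			pc->ended++;	/* exited or killed by a signal */
	}
	return seen;
}

/* Experiment helper: poll up to `tries` times, 10 ms apart. */
static int
poll_children_retry(struct poll_counts *pc, int tries)
{
	struct timespec ts = { 0, 10 * 1000 * 1000 };
	int seen = 0;

	while (tries-- > 0 && (seen = poll_children(pc)) == 0)
		nanosleep(&ts, NULL);
	return seen;
}

/* Experiment helper: fork a child that only sleeps in pause(). */
static pid_t
spawn_sleeper(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		for (;;)
			pause();
	}
	return pid;
}
```

A shell would call something like `poll_children()` once per builtin; the retry wrapper exists only so that a small experiment does not race against signal delivery.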
Re: FYI: problem with wait(1)
Thorsten Glaser t...@mirbsd.de wrote:
|Steffen Daode Nurpmeso dixit:
|
||?0[steffen@sherwood]$ kill -CONT %1
||
||The kernel does not communicate this to the shell,
||so it assumes the job is still stopped and thus
||out of job control. If you “bg”, it should™ work.
|
|I wonder whether the simple patch below (beside its ugliness) would
|be sufficient to deal with the problem? It fixes the particular
|problem, but that's all i can say for sure…
|
|Hm.
|
|I had not heard about WCONTINUED, and Unix signal handling

heh! It's in your manuals (just checked)! Now i wonder why it's not already used in the code.

|is still some sort of black magic to me in many cases… but

I do hate it (i think that's why they're there). Same for exceptions, btw :)

|if you run with that patch for a few weeks and don’t notice
|any problems I’ll include it. Or if we get an opinion from

Only the version test fails. But i'll do so now and do some messing around next week, and will report any problems i encounter. I mean, if it doesn't mess up mksh(1)'s job handling (?), then it's a cheap and simple way to stay in touch with child processes, isn't it? And it also shouldn't hurt if WCONTINUED is defined but doesn't work, as on Mac OS X Snow Leopard.

|someone who Knows™ (oksh developers come to mind…).

OpenBSD's ksh(1) doesn't seem to make use of WCONTINUED, even though WCONTINUED was implemented by millert@ on 2003-08-03 (according to git(1) log). In fact, testing the patched mksh(1) under OpenBSD 5.4 doesn't get the event right… Ouch!

|In any case, thanks for digging this up and even proposing

yup (staggered … and stumbled over it).

|a patch! (Let’s just hope it’s a good patch.)

In ten years from now i'll make the «Poul-Henning» (as in „you can read my name so-and-so-many-times”). Wow! (Just kidding.)

Ciao,

|bye,
|//mirabilos

--steffen
Re: FYI: problem with wait(1)
Thorsten Glaser t...@mirbsd.de wrote:
|Steffen Daode Nurpmeso dixit:
|
|?1[steffen@sherwood]$ kill -STOP %1
|
|Why STOP and not TSTP? (Not related, but curious.)

yeah, sorry – NetBSD bin/48138 and FreeBSD bin/181435 do mention TSTP…

|?0[steffen@sherwood]$ kill -CONT %1
|
|The kernel does not communicate this to the shell,
|so it assumes the job is still stopped and thus
|out of job control. If you “bg”, it should™ work.

I wonder whether the simple patch below (beside its ugliness) would be sufficient to deal with the problem? It fixes the particular problem, but that's all i can say for sure…

|bye,
|//mirabilos

Ciao,

--steffen

Date: 2013-08-21 14:07:40 +0200

Dumb try to add WCONTINUED support
---
 jobs.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/jobs.c b/jobs.c
index 3277c78..55b9000 100644
--- a/jobs.c
+++ b/jobs.c
@@ -1281,7 +1281,11 @@ j_sigchld(int sig MKSH_A_UNUSED)
 	getrusage(RUSAGE_CHILDREN, &ru0);
 	do {
 #ifndef MKSH_NOPROSPECTOFWORK
-		pid = waitpid(-1, &status, (WNOHANG|WUNTRACED));
+		pid = waitpid(-1, &status, (WNOHANG|WUNTRACED
+# ifdef WCONTINUED
+		    |WCONTINUED
+# endif
+		    ));
 #else
 		pid = wait(&status);
 #endif
@@ -1317,7 +1321,14 @@ j_sigchld(int sig MKSH_A_UNUSED)
 		ru0 = ru1;
 		p->status = status;
 #ifndef MKSH_UNEMPLOYED
-		if (WIFSTOPPED(status))
+# ifdef WCONTINUED
+		if (WIFCONTINUED(status)) {
+			p->state = j->state = PRUNNING;
+			/* skip check_job(), no-op in this case */
+			continue;
+		} else
+# endif
+		if (WIFSTOPPED(status))
 			p->state = PSTOPPED;
 		else
 #endif
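The status checks the patch adds can be tried outside the shell. Below is a minimal standalone sketch in C, assuming a system where WCONTINUED is both defined and functional. `enum child_event`, `classify_status()`, `next_event()` and `spawn_sleeper()` are hypothetical names that do not appear in mksh; only the `waitpid()` flags and the `WIF*` macros mirror the patch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>	/* kill(), for callers */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum child_event {
	EV_ERROR,
	EV_EXITED,
	EV_SIGNALED,
	EV_STOPPED,
	EV_CONTINUED
};

/* Classify a wait status; the continued check comes first, as in the patch. */
static enum child_event
classify_status(int status)
{
#ifdef WCONTINUED
	if (WIFCONTINUED(status))
		return EV_CONTINUED;
#endif
	if (WIFSTOPPED(status))
		return EV_STOPPED;
	if (WIFSIGNALED(status))
		return EV_SIGNALED;
	if (WIFEXITED(status))
		return EV_EXITED;
	return EV_ERROR;
}

/* Block until the next state change of `pid` and classify it. */
static enum child_event
next_event(pid_t pid)
{
	int status;
	int flags = WUNTRACED;

	assert(pid > 0);
#ifdef WCONTINUED
	flags |= WCONTINUED;
#endif
	if (waitpid(pid, &status, flags) != pid)
		return EV_ERROR;
	return classify_status(status);
}

/* Experiment helper: fork a child that only sleeps in pause(). */
static pid_t
spawn_sleeper(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		for (;;)
			pause();
	}
	return pid;
}
```

Sending SIGSTOP, SIGCONT and SIGKILL to the child in turn, with a `next_event()` call after each, should yield EV_STOPPED, EV_CONTINUED and EV_SIGNALED where the kernel reports continue events. Where WCONTINUED is defined but does not work, the call after SIGCONT would simply block until the next real state change.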
FYI: problem with wait(1)
Hi Thorsten,

i've just detected something that may interest you. I wanted to start something after a background command finished, but wait(1) didn't work. It turned out that the reason was that i had used signals to temporarily stop that background process. The good news is that only ksh93 (?) and bash(1) get that „right“, i.e., the way i expect it.

mksh(1):

  ?0[steffen@sherwood]$ /bin/sleep 30 &
  [1] 43732
  ?0[steffen@sherwood]$ wait %1
  ^C
  ?130[steffen@sherwood]$ jobs
  [1] + Running /bin/sleep 30
  ?1[steffen@sherwood]$ kill -STOP %1
  [1] + Suspended (signal) /bin/sleep 30
  ?0[steffen@sherwood]$ jobs
  [1] + Suspended (signal) /bin/sleep 30
  ?0[steffen@sherwood]$ wait %1
  ?0[steffen@sherwood]$ kill -CONT %1
  ?0[steffen@sherwood]$ jobs
  [1] + Suspended (signal) /bin/sleep 30
  ?0[steffen@sherwood]$ wait %1
  ?0[steffen@sherwood]$ wait
  ?0[steffen@sherwood]$ jobs
  [1] + Suspended (signal) /bin/sleep 30
  ?0[steffen@sherwood]$ jobs
  [1] + Done /bin/sleep 30

dash(1):

  * [steffen@sherwood]$ /bin/sleep 30 &
  * [steffen@sherwood]$ jobs
  [1] + Running /bin/sleep 30
  * [steffen@sherwood]$ wait %1
  ^C
  * [steffen@sherwood]$ jobs
  [1] + Running /bin/sleep 30
  * [steffen@sherwood]$ kill -TSTP %1
  [1] + Suspended /bin/sleep 30
  * [steffen@sherwood]$ wait %1
  * [steffen@sherwood]$ jobs
  [1] + Suspended /bin/sleep 30
  * [steffen@sherwood]$ kill -CONT %1
  * [steffen@sherwood]$ jobs
  [1] + Suspended /bin/sleep 30
  * [steffen@sherwood]$ wait %1
  * [steffen@sherwood]$ wait
  * [steffen@sherwood]$ jobs
  [1] + Suspended /bin/sleep 30
  * [steffen@sherwood]$
  [1] + Done /bin/sleep 30
  * [steffen@sherwood]$ ^D

NetBSD /bin/sh(1):

  * [steffen@nhead]$ /bin/sleep 30 &
  * [steffen@nhead]$ jobs
  [1] + Running /bin/sleep 30
  * [steffen@nhead]$ kill -STOP %1
  * [steffen@nhead]$ jobs
  [1] + Running /bin/sleep 30
  * [steffen@nhead]$ wait %1
  [1] + Suspended (signal) /bin/sleep 30
  * [steffen@nhead]$ kill -CONT %1
  * [steffen@nhead]$ jobs
  [1] + Suspended (signal) /bin/sleep 30
  * [steffen@nhead]$ wait %1
  * [steffen@nhead]$ wait
  * [steffen@nhead]$ jobs
  [1] + Suspended (signal) /bin/sleep 30
  * [steffen@nhead]$
  [1] Done /bin/sleep 30
  * [steffen@nhead]$

bash(1):

  ?0[steffen@sherwood]$ /bin/sleep 30 &
  [1] 43752
  ?0[steffen@sherwood]$ wait %1
  ^C
  ?1[steffen@sherwood]$ kill -STOP %1
  ?0[steffen@sherwood]$ jobs
  [1]+ Running /bin/sleep 30
  ?0[steffen@sherwood]$ wait %1
  [1]+ Stopped /bin/sleep 30
  ?145[steffen@sherwood]$ wait %1
  bash: warning: wait_for_job: job 1 is stopped
  ?145[steffen@sherwood]$ kill -CONT %1
  ?0[steffen@sherwood]$ jobs
  [1]+ Running /bin/sleep 30
  ?0[steffen@sherwood]$ wait %1
  ^C
  ?1[steffen@sherwood]$ jobs
  [1]+ Running /bin/sleep 30
  ?0[steffen@sherwood]$
  [1]+ Done /bin/sleep 30

Mac OS X /bin/ksh(1):

  ?0[steffen@sherwood]$ /bin/sleep 30 &
  [1] 43762
  ?0[steffen@sherwood]$ jobs
  [1] + Running /bin/sleep 30
  ?0[steffen@sherwood]$ wait
  ^C?258[steffen@sherwood]$ jobs
  [1] + Running /bin/sleep 30
  ?0[steffen@sherwood]$ kill -STOP %1
  ?0[steffen@sherwood]$ jobs
  [1] + Running /bin/sleep 30
  ?0[steffen@sherwood]$ wait %1
  [1] + Stopped (SIGSTOP) /bin/sleep 30
  ?0[steffen@sherwood]$ kill -CONT %1
  ?0[steffen@sherwood]$ jobs
  [1] + Running /bin/sleep 30
  ?0[steffen@sherwood]$ wait %1
  ^C?258[steffen@sherwood]$ jobs
  ?0[steffen@sherwood]$

Ciao,

--steffen
Re: FYI: problem with wait(1)
Steffen Daode Nurpmeso dixit:

|?1[steffen@sherwood]$ kill -STOP %1

Why STOP and not TSTP? (Not related, but curious.)

|?0[steffen@sherwood]$ kill -CONT %1

The kernel does not communicate this to the shell,
so it assumes the job is still stopped and thus
out of job control. If you “bg”, it should™ work.

bye,
//mirabilos
-- 
mirabilos    │ untested
Natureshadow │ works, of course
Natureshadow │ what else ...
mirabilos    │ fine ☺