Re: pthread question
On 10/07/2015 12:53 PM, Stuart Henderson wrote:
> if (pthread_kill(stat_thread, 0)) {

pthread_kill sends the specified signal to the thread, but signal 0 just
checks whether a signal can be sent and sends no signal.
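As a small illustration of the null-signal check, here is a minimal sketch. The helper names probe_thread and probe_self are made up for this example. Note that pthread_kill() reports failure through its return value and does not set errno, so printing errno after the call says nothing about the result of the probe.

```c
#include <assert.h>
#include <pthread.h>
#include <signal.h>

/*
 * Hypothetical helper: probe a thread with the null signal.
 * pthread_kill() returns 0 on success or an error number directly;
 * it does not set errno.
 */
int probe_thread(pthread_t t)
{
	return pthread_kill(t, 0);
}

/*
 * Hypothetical helper: the calling thread is certainly alive,
 * so probing it with signal 0 is expected to succeed.
 */
void probe_self(void)
{
	int rc = probe_thread(pthread_self());

	assert(rc == 0);
}
```

What the probe returns for a thread that has terminated but has not yet been joined or detached is the open question in this thread, so the sketch deliberately does not assert anything about that case.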
Re: pthread question
> Date: Wed, 7 Oct 2015 12:47:35 +0100
> From: Stuart Henderson
>
> Thanks. And I suppose in the case of a more complex program it couldn't
> be guaranteed that the thread ID hasn't been reused by another thread
> elsewhere in the program after the first one has exited.
>
> So I think a better approach for the check_disk program would be to
> pass in a pointer to a struct that includes a marker that the child
> thread can set when it's finished, and check that from the main
> thread, does that make sense?

Yes, properly protected by a mutex and perhaps using a condition variable.

Cheers,

Mark
Re: pthread question
On 2015/10/07 11:52, Stuart Henderson wrote:
> monitoring-plugins has a program that checks available space on partitions.
> Before doing this it does a stat() to check that the requested directory
> exists and is accessible. In their devel tree they have moved to doing
> this stat() in a thread - commit log was "don't let check_disk hang on
> hanging file systems". However this code doesn't work for us.
>
> I've attached a stripped-down test program based on their code that works
> as expected on Linux but fails on OpenBSD. I can always patch to use the
> non-pthread code, but I wondered if anyone has an idea what's up and
> whether the bug is theirs or ours - is pthread_kill(thread, 0) working
> as expected?
>
> $ make thread LDFLAGS=-lpthread
> cc -O2 -pipe -lpthread -o thread thread.c
>
> $ ./thread
> 4
> child
> 3
> 2
> 1
> 0
> child thread did not return within 5s

oops, I added some extra bits while I was playing around, the test
program was meant to be this simpler version.

#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>

void do_something();
void *child(void *);

int
main(int argc, char **argv)
{
    do_something();
}

void do_something()
{
    pthread_t stat_thread;
    int done = 0;
    int timer = 5;
    struct timespec req, rem;

    req.tv_sec = 0;
    pthread_create(&stat_thread, NULL, child, NULL);
    while (timer-- > 0) {
        printf("%u\n", timer);
        req.tv_nsec = 1000;
        nanosleep(&req, &rem);
        if (pthread_kill(stat_thread, 0)) {
            done = 1;
            break;
        } else {
            printf("e %u\n", errno);
            req.tv_nsec = 99000;
            nanosleep(&req, &rem);
        }
    }
    if (done == 1) {
        pthread_join(stat_thread, NULL);
    } else {
        pthread_detach(stat_thread);
        printf("child thread did not return within 5s\n");
    }
}

void *child(void *in)
{
    printf("child\n");
}
pthread question
monitoring-plugins has a program that checks available space on partitions.
Before doing this it does a stat() to check that the requested directory
exists and is accessible. In their devel tree they have moved to doing
this stat() in a thread - commit log was "don't let check_disk hang on
hanging file systems". However this code doesn't work for us.

I've attached a stripped-down test program based on their code that works
as expected on Linux but fails on OpenBSD. I can always patch to use the
non-pthread code, but I wondered if anyone has an idea what's up and
whether the bug is theirs or ours - is pthread_kill(thread, 0) working
as expected?

$ make thread LDFLAGS=-lpthread
cc -O2 -pipe -lpthread -o thread thread.c

$ ./thread
4
child
3
2
1
0
child thread did not return within 5s

#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>

void do_something();
void *child(void *);

int
main(int argc, char **argv)
{
    do_something();
}

void do_something()
{
    pthread_t stat_thread;
    int done = 0;
    int timer = 5;
    struct timespec req, rem;

    req.tv_sec = 0;
    pthread_create(&stat_thread, NULL, child, NULL);
    while (timer-- > 0) {
        printf("%u\n", timer);
        req.tv_nsec = 1000;
        nanosleep(&req, &rem);
        if (pthread_kill(stat_thread, 0)) {
            done = 1;
            break;
        } else {
            printf("e %u\n", errno);
            req.tv_nsec = 99000;
            nanosleep(&req, &rem);
        }
    }
    if (done == 1) {
        pthread_join(stat_thread, NULL);
    } else {
        pthread_detach(stat_thread);
        printf("child thread did not return within 5s\n");
    }
}

void *child(void *in)
{
    struct timespec xeq, xem;

    xeq.tv_nsec = 959000;
    printf("child1\n");
    nanosleep(&xeq, &xem);
    printf("child2\n");
}
Re: pthread question
Thanks. And I suppose in the case of a more complex program it couldn't
be guaranteed that the thread ID hasn't been reused by another thread
elsewhere in the program after the first one has exited.

So I think a better approach for the check_disk program would be to
pass in a pointer to a struct that includes a marker that the child
thread can set when it's finished, and check that from the main
thread, does that make sense?
Re: pthread question
> Date: Wed, 7 Oct 2015 11:53:32 +0100
> From: Stuart Henderson
>
> On 2015/10/07 11:52, Stuart Henderson wrote:
> > monitoring-plugins has a program that checks available space on partitions.
> > Before doing this it does a stat() to check that the requested directory
> > exists and is accessible. In their devel tree they have moved to doing
> > this stat() in a thread - commit log was "don't let check_disk hang on
> > hanging file systems". However this code doesn't work for us.
> >
> > I've attached a stripped-down test program based on their code that works
> > as expected on Linux but fails on OpenBSD. I can always patch to use the
> > non-pthread code, but I wondered if anyone has an idea what's up and
> > whether the bug is theirs or ours - is pthread_kill(thread, 0) working
> > as expected?
> >
> > $ make thread LDFLAGS=-lpthread
> > cc -O2 -pipe -lpthread -o thread thread.c
> >
> > $ ./thread
> > 4
> > child
> > 3
> > 2
> > 1
> > 0
> > child thread did not return within 5s

My reading of the POSIX standard is that our implementation of
pthread_kill(3) is correct and that the program's expectations are
wrong. POSIX says in the "General Information" section on threads:

    The lifetime of a thread ID ends after the thread terminates if it
    was created with the detachstate attribute set to
    PTHREAD_CREATE_DETACHED or if pthread_detach() or pthread_join()
    has been called for that thread.

At the point where the program calls pthread_kill(), pthread_join()
has not been called yet. So the thread ID is still "alive".

The pthread_kill() page says:

    As in kill(), if sig is zero, error checking shall be performed
    but no signal shall actually be sent.

The only "shall fail" is EINVAL for passing a bogus signal number.
The "informative" RATIONALE section mentions an additional error
condition:

    If an implementation detects use of a thread ID after the end of
    its lifetime, it is recommended that the function should fail and
    report an [ESRCH] error.
But since the thread ID is still "alive", that doesn't apply.

Also note that POSIX explicitly mentions kill() here. And for kill()
POSIX says (again in the RATIONALE section):

    Existing implementations vary on the result of a kill() with pid
    indicating an inactive process (a terminated process that has not
    been waited for by its parent). Some indicate success on such a
    call (subject to permission checking), while others give an error
    of [ESRCH]. Since the definition of process lifetime in this
    volume of POSIX.1-2008 covers inactive processes, the [ESRCH]
    error as described is inappropriate in this case. In particular,
    this means that an application cannot have a parent process check
    for termination of a particular child with kill(). (Usually this
    is done with the null signal; this can be done reliably with
    waitpid().)

Which strongly suggests that using pthread_kill(..., 0) on a
non-detached, unjoined thread should not return ESRCH.

> #include <errno.h>
> #include <pthread.h>
> #include <signal.h>
> #include <stdio.h>
> #include <time.h>
>
> void do_something();
> void *child(void *);
>
> int
> main(int argc, char **argv)
> {
>     do_something();
> }
>
> void do_something()
> {
>     pthread_t stat_thread;
>     int done = 0;
>     int timer = 5;
>     struct timespec req, rem;
>
>     req.tv_sec = 0;
>     pthread_create(&stat_thread, NULL, child, NULL);
>     while (timer-- > 0) {
>         printf("%u\n", timer);
>         req.tv_nsec = 1000;
>         nanosleep(&req, &rem);
>         if (pthread_kill(stat_thread, 0)) {
>             done = 1;
>             break;
>         } else {
>             printf("e %u\n", errno);
>             req.tv_nsec = 99000;
>             nanosleep(&req, &rem);
>         }
>     }
>     if (done == 1) {
>         pthread_join(stat_thread, NULL);
>     } else {
>         pthread_detach(stat_thread);
>         printf("child thread did not return within 5s\n");
>     }
> }
>
> void *child(void *in)
> {
>     printf("child\n");
> }
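The POSIX rationale quoted above names waitpid() as the reliable way for a parent to check on a child process. Here is a minimal sketch of that pattern for the process case, polling with WNOHANG. The helper name reap_child_status, the 10 ms pause and the cap of 500 polls are choices made for this example only.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that exits at once with exit_code,
 * then poll for its termination with waitpid(WNOHANG) instead of
 * kill(pid, 0). Returns the child's exit status, or -1 on error or
 * if the child has not been reaped after about 5 seconds.
 */
int reap_child_status(int exit_code)
{
	struct timespec pause = { 0, 10 * 1000 * 1000 };	/* 10 ms */
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0)
		_exit(exit_code);	/* child: terminate immediately */

	for (int tries = 0; tries < 500; tries++) {
		int status;
		pid_t r = waitpid(pid, &status, WNOHANG);

		if (r == pid)
			return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
		if (r == -1)
			return -1;
		assert(r == 0);		/* child exists but has not exited yet */
		nanosleep(&pause, NULL);
	}
	return -1;
}
```

Unlike the null-signal probe, waitpid() both reports termination and ends the lifetime of the process ID, which is why the result does not depend on how the implementation treats a terminated child that has not been waited for.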