Re: pthread question

2015-10-07 Thread Jonas 'Sortie' Termansen
On 10/07/2015 12:53 PM, Stuart Henderson wrote:
>   if (pthread_kill(stat_thread, 0)) {
pthread_kill sends the specified signal to the thread, but signal 0 just
checks whether a signal can be sent and sends no signal.
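
As a minimal sketch of that null-signal check (the helper name
probe_thread is made up for illustration): pthread_kill() reports failure
through its return value rather than through errno, so the result has to
be tested directly.

```c
#include <pthread.h>
#include <signal.h>

/*
 * Probe a thread with the null signal (signal 0).  No signal is
 * delivered; only error checking is performed.  Returns 0 if a signal
 * could be sent, otherwise an error number such as EINVAL or ESRCH.
 * probe_thread is a hypothetical helper, not part of any library.
 */
int
probe_thread(pthread_t thread)
{
    return pthread_kill(thread, 0);
}
```

Calling probe_thread(pthread_self()) returns 0, since the calling thread
is certainly alive.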



Re: pthread question

2015-10-07 Thread Mark Kettenis
> Date: Wed, 7 Oct 2015 12:47:35 +0100
> From: Stuart Henderson 
> 
> Thanks. And I suppose in the case of a more complex program it couldn't
> be guaranteed that the thread ID hasn't been reused by another thread
> elsewhere in the program after the first one has exited.
> 
> So I think a better approach for the check_disk program would be to
> pass in a pointer to a struct that includes a marker that the child
> thread can set when it's finished, and check that from the main
> thread, does that make sense?

Yes, properly protected by a mutex and perhaps using a condition
variable.

Cheers,

Mark



Re: pthread question

2015-10-07 Thread Stuart Henderson
On 2015/10/07 11:52, Stuart Henderson wrote:
> monitoring-plugins has a program that checks available space on partitions.
> Before doing this it does a stat() to check that the requested directory
> exists and is accessible. In their devel tree they have moved to doing
> this stat() in a thread - commit log was "don't let check_disk hang on
> hanging file systems". However this code doesn't work for us.
> 
> I've attached a stripped-down test program based on their code that works
> as expected on Linux but fails on OpenBSD. I can always patch to use the
> non-pthread code, but I wondered if anyone has an idea what's up and
> whether the bug is theirs or ours - is pthread_kill(thread, 0) working
> as expected?
> 
> $ make thread LDFLAGS=-lpthread
> cc -O2 -pipe   -lpthread -o thread thread.c
> 
> $ ./thread
> 4
> child
> 3
> 2
> 1
> 0
> child thread did not return within 5s

Oops, I added some extra bits while I was playing around; the test
program was meant to be this simpler version:

#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>

void do_something();
void *child(void *);

int
main(int argc, char **argv)
{
    do_something();
}

void do_something()
{
    pthread_t stat_thread;
    int done = 0;
    int timer = 5;
    struct timespec req, rem;

    req.tv_sec = 0;
    pthread_create(&stat_thread, NULL, child, NULL);
    while (timer-- > 0) {
        printf("%u\n", timer);
        req.tv_nsec = 1000;
        nanosleep(&req, &rem);
        if (pthread_kill(stat_thread, 0)) {
            done = 1;
            break;
        } else {
            printf("e %u\n", errno);
            req.tv_nsec = 99000;
            nanosleep(&req, &rem);
        }
    }
    if (done == 1) {
        pthread_join(stat_thread, NULL);
    } else {
        pthread_detach(stat_thread);
        printf("child thread did not return within 5s\n");
    }
}

void *child(void *in)
{
    printf("child\n");
}


pthread question

2015-10-07 Thread Stuart Henderson
monitoring-plugins has a program that checks available space on partitions.
Before doing this it does a stat() to check that the requested directory
exists and is accessible. In their devel tree they have moved to doing
this stat() in a thread - commit log was "don't let check_disk hang on
hanging file systems". However this code doesn't work for us.

I've attached a stripped-down test program based on their code that works
as expected on Linux but fails on OpenBSD. I can always patch to use the
non-pthread code, but I wondered if anyone has an idea what's up and
whether the bug is theirs or ours - is pthread_kill(thread, 0) working
as expected?

$ make thread LDFLAGS=-lpthread
cc -O2 -pipe   -lpthread -o thread thread.c

$ ./thread
4
child
3
2
1
0
child thread did not return within 5s

#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>

void do_something();
void *child(void *);

int
main(int argc, char **argv)
{
    do_something();
}

void do_something()
{
    pthread_t stat_thread;
    int done = 0;
    int timer = 5;
    struct timespec req, rem;

    req.tv_sec = 0;
    pthread_create(&stat_thread, NULL, child, NULL);
    while (timer-- > 0) {
        printf("%u\n", timer);
        req.tv_nsec = 1000;
        nanosleep(&req, &rem);
        if (pthread_kill(stat_thread, 0)) {
            done = 1;
            break;
        } else {
            printf("e %u\n", errno);
            req.tv_nsec = 99000;
            nanosleep(&req, &rem);
        }
    }
    if (done == 1) {
        pthread_join(stat_thread, NULL);
    } else {
        pthread_detach(stat_thread);
        printf("child thread did not return within 5s\n");
    }
}

void *child(void *in)
{
    struct timespec xeq, xem;
    xeq.tv_nsec = 959000;
    printf("child1\n");
    nanosleep(&xeq, &xem);
    printf("child2\n");
}


Re: pthread question

2015-10-07 Thread Stuart Henderson
Thanks. And I suppose in the case of a more complex program it couldn't
be guaranteed that the thread ID hasn't been reused by another thread
elsewhere in the program after the first one has exited.

So I think a better approach for the check_disk program would be to
pass in a pointer to a struct that includes a marker that the child
thread can set when it's finished, and check that from the main
thread, does that make sense?



Re: pthread question

2015-10-07 Thread Mark Kettenis
> Date: Wed, 7 Oct 2015 11:53:32 +0100
> From: Stuart Henderson 
> 
> On 2015/10/07 11:52, Stuart Henderson wrote:
> > monitoring-plugins has a program that checks available space on partitions.
> > Before doing this it does a stat() to check that the requested directory
> > exists and is accessible. In their devel tree they have moved to doing
> > this stat() in a thread - commit log was "don't let check_disk hang on
> > hanging file systems". However this code doesn't work for us.
> > 
> > I've attached a stripped-down test program based on their code that works
> > as expected on Linux but fails on OpenBSD. I can always patch to use the
> > non-pthread code, but I wondered if anyone has an idea what's up and
> > whether the bug is theirs or ours - is pthread_kill(thread, 0) working
> > as expected?
> > 
> > $ make thread LDFLAGS=-lpthread
> > cc -O2 -pipe   -lpthread -o thread thread.c
> > 
> > $ ./thread
> > 4
> > child
> > 3
> > 2
> > 1
> > 0
> > child thread did not return within 5s

My reading of the POSIX standard is that our implementation of
pthread_kill(3) is correct and that the program's expectations are
wrong.

POSIX says in the "General Information" section on threads:

  The lifetime of a thread ID ends after the thread terminates if it
  was created with the detachstate attribute set to
  PTHREAD_CREATE_DETACHED or if pthread_detach() or pthread_join() has
  been called for that thread.

At the point where the program calls pthread_kill(), pthread_join()
has not been called yet.  So the thread ID is still "alive".

The pthread_kill() page says:

  As in kill(), if sig is zero, error checking shall be performed but
  no signal shall actually be sent.

The only "shall fail" is EINVAL for passing a bogus signal number.
The "informative" RATIONALE section mentions an additional error
condition:

  If an implementation detects use of a thread ID after the end of its
  lifetime, it is recommended that the function should fail and report
  an [ESRCH] error.

But since the thread ID is still "alive", that doesn't apply.

Also note that POSIX explicitly mentions kill() here.  And for kill()
POSIX says (again in the RATIONALE section):

  Existing implementations vary on the result of a kill() with pid
  indicating an inactive process (a terminated process that has not
  been waited for by its parent). Some indicate success on such a call
  (subject to permission checking), while others give an error of
  [ESRCH]. Since the definition of process lifetime in this volume of
  POSIX.1-2008 covers inactive processes, the [ESRCH] error as
  described is inappropriate in this case. In particular, this means
  that an application cannot have a parent process check for
  termination of a particular child with kill(). (Usually this is done
  with the null signal; this can be done reliably with waitpid().)

Which strongly suggests that using pthread_kill(..., 0) on a
non-detached, unjoined thread should not return ESRCH.
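
The kill() rationale quoted above points to waitpid() as the reliable way
to learn that a child has terminated.  A minimal sketch of that
process-level analogue follows; reap_child is a made-up helper name and
the immediate _exit() in the child is only there to keep the example
short.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits at once with exit_code, then collect it with
 * waitpid().  Returns the child's exit status, or -1 on error.
 * reap_child is a hypothetical helper for illustration only.
 */
int
reap_child(int exit_code)
{
    int status;
    pid_t pid, r;

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(exit_code);       /* child: terminate immediately */

    /* Parent: block until this child has terminated, retrying on EINTR. */
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

To check for termination without blocking, waitpid() can be called with
WNOHANG instead of 0; it then returns 0 while the child is still running.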

> #include <errno.h>
> #include <pthread.h>
> #include <signal.h>
> #include <stdio.h>
> #include <time.h>
> 
> void do_something();
> void *child(void *);
> 
> int
> main(int argc, char **argv)
> {
>     do_something();
> }
> 
> void do_something()
> {
>     pthread_t stat_thread;
>     int done = 0;
>     int timer = 5;
>     struct timespec req, rem;
> 
>     req.tv_sec = 0;
>     pthread_create(&stat_thread, NULL, child, NULL);
>     while (timer-- > 0) {
>         printf("%u\n", timer);
>         req.tv_nsec = 1000;
>         nanosleep(&req, &rem);
>         if (pthread_kill(stat_thread, 0)) {
>             done = 1;
>             break;
>         } else {
>             printf("e %u\n", errno);
>             req.tv_nsec = 99000;
>             nanosleep(&req, &rem);
>         }
>     }
>     if (done == 1) {
>         pthread_join(stat_thread, NULL);
>     } else {
>         pthread_detach(stat_thread);
>         printf("child thread did not return within 5s\n");
>     }
> }
> 
> void *child(void *in)
> {
>     printf("child\n");
> }