On 09/20/2011 09:47 AM, Pádraig Brady wrote:
> On 09/19/2011 12:48 PM, Jim Meyering wrote:
>> While working around the getcwd limitation on kFreeBSD,
>> I noticed that all timeout-using tests were hanging.
>> HAVE_TIMER_SETTIME was defined, so I confirmed that
>> disabling that definition made it so all tests now pass:
>>
>> diff --git a/src/timeout.c b/src/timeout.c
>> index d734e4e..c87d603 100644
>> --- a/src/timeout.c
>> +++ b/src/timeout.c
>> @@ -112,7 +112,7 @@ settimeout (double duration)
>>     deprecated by POSIX.  Instead we fallback to single second
>>     resolution provided by alarm().  */
>>
>> -#if HAVE_TIMER_SETTIME
>> +#if 0
>>    struct timespec ts = dtotimespec (duration);
>>    struct itimerspec its = { {0, 0}, ts };
>>    timer_t timerid;
>>
>> Obviously the above is just FYI.
>> The real change should probably be in configure-time code that tests
>> not just for existence, but also for working timer functions.
> 
> It's not that simple unfortunately.
> What's happening there is that glibc is emulating
> these timer functions using an implicitly created thread
> (on any non Linux >= 2.6 kernel).
> 
> So when timeout::cleanup() gets the SIGALRM
> and then does kill(0, SIGTERM), _two_ signals
> are sent, which causes it to go into a loop
> sending signals.  (So do a `killall -9 timeout`
> on any kfreebsd systems you experienced the hang on.)
> 
> Interestingly FreeBSD 8.2 has these timer functions
> implemented in the kernel, so perhaps GNU/kFreeBSD debian 8.1
> just needs better wiring up with glibc?
> 
> In any case we should handle this, but I'm unsure how to proceed.
> This would work around it:
> 
> diff --git a/src/timeout.c b/src/timeout.c
> index d734e4e..de7faad 100644
> --- a/src/timeout.c
> +++ b/src/timeout.c
> @@ -146,6 +146,7 @@ settimeout (double duration)
>  static int
>  send_sig (int where, int sig)
>  {
> +  signal (sig, SIG_IGN);
>    sigs_to_ignore[sig] = 1;
>    return kill (where, sig);
>  }
> 
> But I'm worried about blocking signals we send indefinitely,
> so I think we'll need another approach.

I can't think of anything better,
so attached is a full patch for the above method.

I'd prefer to avoid timer_settime() if it creates a thread,
but I don't know how to portably detect the presence of multiple threads.
An alternative to ignore signals we send to our own group is
attach a marker to the signal with sigqueue() and we
could then discard all such signals. But that functionality
is missing on many platforms and would complicate things anyway.

cheers,
Pádraig.
>From 9b8698bffa31802684e29cea6ce05123e0f79283 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <[email protected]>
Date: Wed, 21 Sep 2011 15:22:52 +0100
Subject: [PATCH] timeout: handle implicitly created threads

On some systems like glibc on kFreeBSD, a thread is
implicitly created when timer_settime() is used.
This breaks our scheme to ignore signals we've
sent ourselves.

* src/timeout.c (send_sig): Change the scheme used to
ignore signals we've sent ourselves, to a more robust
but perhaps limited scheme of ignoring all signals of
a certain type after we've sent that signal to the job.
---
 src/timeout.c |   20 +++++++++++---------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/src/timeout.c b/src/timeout.c
index d734e4e..59b937e 100644
--- a/src/timeout.c
+++ b/src/timeout.c
@@ -77,7 +77,6 @@
 static int timed_out;
 static int term_signal = SIGTERM;  /* same default as kill command.  */
 static int monitored_pid;
-static int sigs_to_ignore[NSIG];   /* so monitor can ignore sigs it resends.  
*/
 static double kill_after;
 static bool foreground;            /* whether to use another program group.  */
 
@@ -141,12 +140,20 @@ settimeout (double duration)
   alarm (timeint);
 }
 
-/* send sig to group but not ourselves.
- * FIXME: Is there a better way to achieve this?  */
+/* send SIG avoiding the current process.  */
+
 static int
 send_sig (int where, int sig)
 {
-  sigs_to_ignore[sig] = 1;
+  /* If sending to the group, then ignore the signal,
+     so we don't go into a signal loop.  Note that this will ignore any of the
+     signals registered in install_signal_handlers(), that are sent after we
+     propagate the first one, which hopefully won't be an issue.  Note this
+     process can be implicitly multithreaded due to some timer_settime()
+     implementations, therefore a signal sent to the group, can be sent
+     multiple times to this process.  */
+  if (where == 0)
+    signal (sig, SIG_IGN);
   return kill (where, sig);
 }
 
@@ -160,11 +167,6 @@ cleanup (int sig)
     }
   if (monitored_pid)
     {
-      if (sigs_to_ignore[sig])
-        {
-          sigs_to_ignore[sig] = 0;
-          return;
-        }
       if (kill_after)
         {
           /* Start a new timeout after which we'll send SIGKILL.  */
-- 
1.7.6

Reply via email to