On 09/20/2011 09:47 AM, Pádraig Brady wrote:
> On 09/19/2011 12:48 PM, Jim Meyering wrote:
>> While working around the getcwd limitation on kFreeBSD,
>> I noticed that all timeout-using tests were hanging.
>> HAVE_TIMER_SETTIME was defined, so I confirmed that
>> disabling that definition made it so all tests now pass:
>>
>> diff --git a/src/timeout.c b/src/timeout.c
>> index d734e4e..c87d603 100644
>> --- a/src/timeout.c
>> +++ b/src/timeout.c
>> @@ -112,7 +112,7 @@ settimeout (double duration)
>> deprecated by POSIX. Instead we fallback to single second
>> resolution provided by alarm(). */
>>
>> -#if HAVE_TIMER_SETTIME
>> +#if 0
>> struct timespec ts = dtotimespec (duration);
>> struct itimerspec its = { {0, 0}, ts };
>> timer_t timerid;
>>
>> Obviously the above is just FYI.
>> The real change should probably be in configure-time code that tests
>> not just for existence, but also for working timer functions.
>
> It's not that simple unfortunately.
> What's happening there is that glibc is emulating
> these timer functions using an implicitly created thread
> (on any non Linux >= 2.6 kernel).
>
> So when timeout::cleanup() gets the SIGALRM
> and then does kill(0, SIGTERM), _two_ signals
> are sent, which causes it to go into a loop
> sending signals. (So do a `killall -9 timeout`
> on any kfreebsd systems you experienced the hang on.)
>
> Interestingly FreeBSD 8.2 has these timer functions
> implemented in the kernel, so perhaps GNU/kFreeBSD debian 8.1
> just needs better wiring up with glibc?
>
> In any case we should handle this, but I'm unsure how to proceed.
> This would work around it:
>
> diff --git a/src/timeout.c b/src/timeout.c
> index d734e4e..de7faad 100644
> --- a/src/timeout.c
> +++ b/src/timeout.c
> @@ -146,6 +146,7 @@ settimeout (double duration)
> static int
> send_sig (int where, int sig)
> {
> + signal (sig, SIG_IGN);
> sigs_to_ignore[sig] = 1;
> return kill (where, sig);
> }
>
> But I'm worried about blocking signals we send indefinitely,
> so I think we'll need another approach.
I can't think of anything better,
so attached is a full patch for the above method.
I'd prefer to avoid timer_settime() if it creates a thread,
but I don't know how to portably detect the presence of multiple threads.
An alternative to ignore signals we send to our own group is
attach a marker to the signal with sigqueue() and we
could then discard all such signals. But that functionality
is missing on many platforms and would complicate things anyway.
cheers,
Pádraig.
>From 9b8698bffa31802684e29cea6ce05123e0f79283 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <[email protected]>
Date: Wed, 21 Sep 2011 15:22:52 +0100
Subject: [PATCH] timeout: handle implicitly created threads
On some systems like glibc on kFreeBSD, a thread is
implicitly created when timer_settime() is used.
This breaks our scheme to ignore signals we've
sent ourselves.
* src/timeout.c (send_sig): Change the scheme used to
ignore signals we've sent ourselves, to a more robust
but perhaps limited scheme of ignoring all signals of
a certain type after we've sent that signal to the job.
---
src/timeout.c | 20 +++++++++++---------
1 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/src/timeout.c b/src/timeout.c
index d734e4e..59b937e 100644
--- a/src/timeout.c
+++ b/src/timeout.c
@@ -77,7 +77,6 @@
static int timed_out;
static int term_signal = SIGTERM; /* same default as kill command. */
static int monitored_pid;
-static int sigs_to_ignore[NSIG]; /* so monitor can ignore sigs it resends.
*/
static double kill_after;
static bool foreground; /* whether to use another program group. */
@@ -141,12 +140,20 @@ settimeout (double duration)
alarm (timeint);
}
-/* send sig to group but not ourselves.
- * FIXME: Is there a better way to achieve this? */
+/* send SIG avoiding the current process. */
+
static int
send_sig (int where, int sig)
{
- sigs_to_ignore[sig] = 1;
+ /* If sending to the group, then ignore the signal,
+ so we don't go into a signal loop. Note that this will ignore any of the
+ signals registered in install_signal_handlers(), that are sent after we
+ propagate the first one, which hopefully won't be an issue. Note this
+ process can be implicitly multithreaded due to some timer_settime()
+ implementations, therefore a signal sent to the group, can be sent
+ multiple times to this process. */
+ if (where == 0)
+ signal (sig, SIG_IGN);
return kill (where, sig);
}
@@ -160,11 +167,6 @@ cleanup (int sig)
}
if (monitored_pid)
{
- if (sigs_to_ignore[sig])
- {
- sigs_to_ignore[sig] = 0;
- return;
- }
if (kill_after)
{
/* Start a new timeout after which we'll send SIGKILL. */
--
1.7.6