Bug#966692: stunnel4: FTBFS because of test hang
On Thu, Aug 06, 2020 at 03:23:08AM +0200, Michał Mirosław wrote: > On Thu, Aug 06, 2020 at 03:10:55AM +0200, Michał Mirosław wrote: > > On Thu, Aug 06, 2020 at 02:39:42AM +0200, Michał Mirosław wrote: > > > On Thu, Aug 06, 2020 at 03:16:35AM +0300, Peter Pentchev wrote: > > > > On Thu, Aug 06, 2020 at 12:48:10AM +0200, Michał Mirosław wrote: > > > > > On Thu, Aug 06, 2020 at 12:29:36AM +0300, Peter Pentchev wrote: > > > > > > On Wed, Aug 05, 2020 at 10:52:31PM +0200, Michał Mirosław wrote: > > > > > [...] > > > > > > > Using print-debugging, I see that it stops at wait_for_child line > > > > > > > just > > > > > > > after printing the version. It seems that something is reaping > > > > > > > the child > > > > > > > before the main thread has a chance to wait for it. > > > > > > > > > > > > OK, so the only thing that comes to my mind now is that you may be > > > > > > hitting a crazy, crazy race between register_child() and > > > > > > child_reaper(), > > > > > > and I say "a crazy, crazy race", because the test has to (apparently > > > > > > reproducibly) receive the CHLD signal exactly between the check and > > > > > > the creation in register_child()'s first "$children{...} //= ...cv" > > > > > > statement. > > > > > > > > > > Well, there is nothing that prevents SIGCHLD arriving between fork() > > > > > and > > > > > register_child(). You could test this with more confidence (though not > > > > > 100%-reliably) by putting 'exit 1' just at the start of ($pid == 0) > > > > > branch. > > > > > > > > Nah, the problem is not just "between fork() and register_child()". > > > > It really must arrive at a very specific moment in time, because > > > > the //= operations for setting $children{$pid}{cv} try to make sure that > > > > a new value is not set (that is, a new condition variable is not > > > > created) if there already is such an element in the array. So the race > > > > is indeed between the //= in register_child() and the //= in > > > > child_reaper() - that is, child_reaper() must be invoked (SIGCHLD must > > > > arrive) *during* the execution of the //= in register_child(). > > > > > > > > Unless I'm missing something, which is not at all out of the question :) > > > > > > The assignment seems not to be at fault (see last strace). I don't know > > > perl's > > > internals enough to say if this statement can be interrupted visibly by a > > > signal > > > handler (I would guess not a perl handler, though). There are two wait4() > > > calls > > > even before child_reaper has a chance to run. > > > > Another data point: this happens only with anyevent + libev and not with > > anyevent + libevent. The first is preferred and installed by default with > > libanyevent-perl, though. And this is why I could not reproduce it until now - I've always (well, okay, ever since it was introduced, I'm a bit older than that) had apt *not* automatically install recommended packages... And here I thought I was going crazy... thanks, now it's, mm, let's say easier to reproduce! > AnyEvent's doc [1] mentions that the framework installs (or just might?) it's > own SIGCHLD handler. Maybe there are just too many handlers for SIGCHLD? Aaand this is why I should never be let near a keyboard... So how many years have I been doing Perl programming now?... and I managed to forget about AnyEvent installing its own SIGCHLD handler? Great. Just great. Thanks an *awful* lot for your perseverance, your analysis, and basically doing my own debugging work for me! Expect another patch soon. G'luck, Peter -- Peter Pentchev r...@ringlet.net r...@debian.org p...@storpool.com PGP key:http://people.FreeBSD.org/~roam/roam.key.asc Key fingerprint 2EE7 A7A5 17FC 124C F115 C354 651E EFB0 2527 DF13 signature.asc Description: PGP signature
Bug#966692: stunnel4: FTBFS because of test hang
On Thu, Aug 06, 2020 at 03:10:55AM +0200, Michał Mirosław wrote: > On Thu, Aug 06, 2020 at 02:39:42AM +0200, Michał Mirosław wrote: > > On Thu, Aug 06, 2020 at 03:16:35AM +0300, Peter Pentchev wrote: > > > On Thu, Aug 06, 2020 at 12:48:10AM +0200, Michał Mirosław wrote: > > > > On Thu, Aug 06, 2020 at 12:29:36AM +0300, Peter Pentchev wrote: > > > > > On Wed, Aug 05, 2020 at 10:52:31PM +0200, Michał Mirosław wrote: > > > > [...] > > > > > > Using print-debugging, I see that it stops at wait_for_child line > > > > > > just > > > > > > after printing the version. It seems that something is reaping the > > > > > > child > > > > > > before the main thread has a chance to wait for it. > > > > > > > > > > OK, so the only thing that comes to my mind now is that you may be > > > > > hitting a crazy, crazy race between register_child() and > > > > > child_reaper(), > > > > > and I say "a crazy, crazy race", because the test has to (apparently > > > > > reproducibly) receive the CHLD signal exactly between the check and > > > > > the creation in register_child()'s first "$children{...} //= ...cv" > > > > > statement. > > > > > > > > Well, there is nothing that prevents SIGCHLD arriving between fork() and > > > > register_child(). You could test this with more confidence (though not > > > > 100%-reliably) by putting 'exit 1' just at the start of ($pid == 0) > > > > branch. > > > > > > Nah, the problem is not just "between fork() and register_child()". > > > It really must arrive at a very specific moment in time, because > > > the //= operations for setting $children{$pid}{cv} try to make sure that > > > a new value is not set (that is, a new condition variable is not > > > created) if there already is such an element in the array. So the race > > > is indeed between the //= in register_child() and the //= in > > > child_reaper() - that is, child_reaper() must be invoked (SIGCHLD must > > > arrive) *during* the execution of the //= in register_child(). > > > > > > Unless I'm missing something, which is not at all out of the question :) > > > > The assignment seems not to be at fault (see last strace). I don't know > > perl's > > internals enough to say if this statement can be interrupted visibly by a > > signal > > handler (I would guess not a perl handler, though). There are two wait4() > > calls > > even before child_reaper has a chance to run. > > Another data point: this happens only with anyevent + libev and not with > anyevent + libevent. The first is preferred and installed by default with > libanyevent-perl, though. > > $ dpkg -l libanyevent-perl libev-perl | cat > Desired=Unknown/Install/Remove/Purge/Hold > | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend > |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad) > ||/ Name Version Architecture Description > +++---- > ii libanyevent-perl 7.140-3 amd64event loop framework with > multiple implementations > ii libev-perl 4.25-1 amd64Perl interface to libev, the > high performance event loop AnyEvent's doc [1] mentions that the framework installs (or just might?) it's own SIGCHLD handler. Maybe there are just too many handlers for SIGCHLD? [1] https://metacpan.org/pod/AnyEvent#SIGNALS Best Regards, Michał Mirosław
Bug#966692: stunnel4: FTBFS because of test hang
On Thu, Aug 06, 2020 at 02:39:42AM +0200, Michał Mirosław wrote: > On Thu, Aug 06, 2020 at 03:16:35AM +0300, Peter Pentchev wrote: > > On Thu, Aug 06, 2020 at 12:48:10AM +0200, Michał Mirosław wrote: > > > On Thu, Aug 06, 2020 at 12:29:36AM +0300, Peter Pentchev wrote: > > > > On Wed, Aug 05, 2020 at 10:52:31PM +0200, Michał Mirosław wrote: > > > [...] > > > > > Using print-debugging, I see that it stops at wait_for_child line just > > > > > after printing the version. It seems that something is reaping the > > > > > child > > > > > before the main thread has a chance to wait for it. > > > > > > > > OK, so the only thing that comes to my mind now is that you may be > > > > hitting a crazy, crazy race between register_child() and child_reaper(), > > > > and I say "a crazy, crazy race", because the test has to (apparently > > > > reproducibly) receive the CHLD signal exactly between the check and > > > > the creation in register_child()'s first "$children{...} //= ...cv" > > > > statement. > > > > > > Well, there is nothing that prevents SIGCHLD arriving between fork() and > > > register_child(). You could test this with more confidence (though not > > > 100%-reliably) by putting 'exit 1' just at the start of ($pid == 0) > > > branch. > > > > Nah, the problem is not just "between fork() and register_child()". > > It really must arrive at a very specific moment in time, because > > the //= operations for setting $children{$pid}{cv} try to make sure that > > a new value is not set (that is, a new condition variable is not > > created) if there already is such an element in the array. So the race > > is indeed between the //= in register_child() and the //= in > > child_reaper() - that is, child_reaper() must be invoked (SIGCHLD must > > arrive) *during* the execution of the //= in register_child(). > > > > Unless I'm missing something, which is not at all out of the question :) > > The assignment seems not to be at fault (see last strace). I don't know perl's > internals enough to say if this statement can be interrupted visibly by a > signal > handler (I would guess not a perl handler, though). There are two wait4() > calls > even before child_reaper has a chance to run. Another data point: this happens only with anyevent + libev and not with anyevent + libevent. The first is preferred and installed by default with libanyevent-perl, though. $ dpkg -l libanyevent-perl libev-perl | cat Desired=Unknown/Install/Remove/Purge/Hold | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad) ||/ Name Version Architecture Description +++---- ii libanyevent-perl 7.140-3 amd64event loop framework with multiple implementations ii libev-perl 4.25-1 amd64Perl interface to libev, the high performance event loop Best Regards, Michał Mirosław
Bug#966692: stunnel4: FTBFS because of test hang
On Thu, Aug 06, 2020 at 03:16:35AM +0300, Peter Pentchev wrote: > On Thu, Aug 06, 2020 at 12:48:10AM +0200, Michał Mirosław wrote: > > On Thu, Aug 06, 2020 at 12:29:36AM +0300, Peter Pentchev wrote: > > > On Wed, Aug 05, 2020 at 10:52:31PM +0200, Michał Mirosław wrote: > > [...] > > > > Using print-debugging, I see that it stops at wait_for_child line just > > > > after printing the version. It seems that something is reaping the child > > > > before the main thread has a chance to wait for it. > > > > > > OK, so the only thing that comes to my mind now is that you may be > > > hitting a crazy, crazy race between register_child() and child_reaper(), > > > and I say "a crazy, crazy race", because the test has to (apparently > > > reproducibly) receive the CHLD signal exactly between the check and > > > the creation in register_child()'s first "$children{...} //= ...cv" > > > statement. > > > > Well, there is nothing that prevents SIGCHLD arriving between fork() and > > register_child(). You could test this with more confidence (though not > > 100%-reliably) by putting 'exit 1' just at the start of ($pid == 0) branch. > > Nah, the problem is not just "between fork() and register_child()". > It really must arrive at a very specific moment in time, because > the //= operations for setting $children{$pid}{cv} try to make sure that > a new value is not set (that is, a new condition variable is not > created) if there already is such an element in the array. So the race > is indeed between the //= in register_child() and the //= in > child_reaper() - that is, child_reaper() must be invoked (SIGCHLD must > arrive) *during* the execution of the //= in register_child(). > > Unless I'm missing something, which is not at all out of the question :) The assignment seems not to be at fault (see last strace). I don't know perl's internals enough to say if this statement can be interrupted visibly by a signal handler (I would guess not a perl handler, though). There are two wait4() calls even before child_reaper has a chance to run. > > > Can you apply the following patch and show me the output of running > > > the test? > > > > Sure, but I got no patch. :-) > > Oof. Not my day, is it... Here it is... I hope. [cut patch] With the patch applied: $ TEST_STUNNEL=src/stunnel strace -f -o /tmp/log debian/tests/runtime Found the certificate at debian/tests/certs/certificate.pem and the private key at debian/tests/certs/key.pem Using the /tmp/user/1000/EklzlCzeRO temporary directory About to get the stunnel version information register_child: pid 14943 cv AnyEvent::CondVar=HASH(0x559e0d7572e0) RDBG child_reaper() invoked RDBG - pid -1 status -1 RDBG - done RDBG - out of the child_reaper() loop Got stunnel version 5.56 ^C And in the strace log: 14942 epoll_wait(3, [{EPOLLIN, {u32=5, u64=4294967301}}], 64, 59743) = 1 14942 epoll_wait(3, [{EPOLLIN, {u32=5, u64=4294967301}}], 64, 59743) = 1 14943 exit_group(0 14942 read(5, 14943 <... exit_group resumed>) = ? 14942 <... read resumed> "TIMEOUTconnect = 10 seco"..., 8192) = 105 14942 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}], 64, 59743) = 1 14943 +++ exited with 0 +++ 14942 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=4294967301}}) = 0 14942 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=14943, si_uid=1000, si_status=0, si_utime=0, si_stime=0} --- 14942 write(4, "\1\0\0\0\0\0\0\0", 8) = 8 14942 rt_sigreturn({mask=[PIPE]}) = 0 14942 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}, {EPOLLIN, {u32=4, u64=4294967300}}], 64, 59743) = 2 14942 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=4294967301}}) = 0 14942 read(4, "\1\0\0\0\0\0\0\0", 8)= 8 14942 wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG|WSTOPPED|WCONTINUED, NULL) = 14943 14942 wait4(-1, 0x7fffcacfddc4, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes) 14942 write(1, "RDBG child_reaper() invoked\n", 28) = 28 14942 wait4(-1, 0x7fffcacfdb74, WNOHANG, NULL) = -1 ECHILD (No child processes) 14942 write(1, "RDBG - pid -1 status -1\n", 24) = 24 14942 write(1, "RDBG - done\n", 14) = 14 14942 write(1, "RDBG - out of the child_reaper()"..., 38) = 38 14942 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}], 64, 59743) = 1 14942 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=4294967301}}) = 0 14942 read(5, "", 8192) = 0 14942 write(1, "Got stunnel version 5.56\n", 25) = 25 14942 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}], 64, 59743) = 1 14942 epoll_ctl(3, EPOLL_CTL_DEL, 5, 0x559e0dba9760) = 0 14942 epoll_wait(3, 0x559e0dba9760, 64, 59743) = -1 EINTR (Interrupted system call) 14942 --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} --- With 'exit 1' in child branch: $ TEST_STUNNEL=src/stunnel strace -f -o /tmp/log debian/tests/runtime Found the certificate at debian/tests/certs/certificate.pem and the private key at debian/tests/certs/key.pem Using the /tmp/user/1000/MWxDeDDlxS temporary
Bug#966692: stunnel4: FTBFS because of test hang
On Thu, Aug 06, 2020 at 12:48:10AM +0200, Michał Mirosław wrote: > On Thu, Aug 06, 2020 at 12:29:36AM +0300, Peter Pentchev wrote: > > On Wed, Aug 05, 2020 at 10:52:31PM +0200, Michał Mirosław wrote: > [...] > > > Using print-debugging, I see that it stops at wait_for_child line just > > > after printing the version. It seems that something is reaping the child > > > before the main thread has a chance to wait for it. > > > > OK, so the only thing that comes to my mind now is that you may be > > hitting a crazy, crazy race between register_child() and child_reaper(), > > and I say "a crazy, crazy race", because the test has to (apparently > > reproducibly) receive the CHLD signal exactly between the check and > > the creation in register_child()'s first "$children{...} //= ...cv" > > statement. > > Well, there is nothing that prevents SIGCHLD arriving between fork() and > register_child(). You could test this with more confidence (though not > 100%-reliably) by putting 'exit 1' just at the start of ($pid == 0) branch. Nah, the problem is not just "between fork() and register_child()". It really must arrive at a very specific moment in time, because the //= operations for setting $children{$pid}{cv} try to make sure that a new value is not set (that is, a new condition variable is not created) if there already is such an element in the array. So the race is indeed between the //= in register_child() and the //= in child_reaper() - that is, child_reaper() must be invoked (SIGCHLD must arrive) *during* the execution of the //= in register_child(). Unless I'm missing something, which is not at all out of the question :) > > Can you apply the following patch and show me the output of running > > the test? > > Sure, but I got no patch. :-) Oof. Not my day, is it... Here it is... I hope. G'luck, Peter -- Peter Pentchev r...@ringlet.net r...@debian.org p...@storpool.com PGP key:http://people.FreeBSD.org/~roam/roam.key.asc Key fingerprint 2EE7 A7A5 17FC 124C F115 C354 651E EFB0 2527 DF13 commit 859acd0603a5bc74620df4949e1450805b7ba151 Author: Peter Pentchev Date: Thu Aug 6 00:26:32 2020 +0300 Diagnostic output for the runtime test's child reaper. diff --git a/debian/tests/runtime b/debian/tests/runtime index f594d9a..81cef23 100755 --- a/debian/tests/runtime +++ b/debian/tests/runtime @@ -55,19 +55,25 @@ sub unregister_child_reaper() sub child_reaper() { + say 'RDBG child_reaper() invoked'; while (1) { my $pid = waitpid -1, WNOHANG; my $status = $?; + say "RDBG - pid $pid status $status"; if (!defined $pid) { die "Could not waitpid() in a SIGCHLD handler: $!\n"; } elsif ($pid == 0 || $pid == -1) { + say 'RDBG - done'; last; } else { + say 'RDBG - '.(exists $children{$pid} ? '' : 'not ').'found in the children hash'; $children{$pid}{cv} //= AnyEvent->condvar; + say 'RDBG - cv '.$children{$pid}{cv}.': '.($children{$pid}{cv}->ready ? '' : 'not ').'ready'; $children{$pid}{cv}->send($status); } } + say 'RDBG - out of the child_reaper() loop'; } sub register_child($ $) @@ -76,6 +82,7 @@ sub register_child($ $) # Weird, but we want it to be at least reasonably atomic-like $children{$pid}{cv} //= AnyEvent->condvar; + say "register_child: pid $pid cv ".$children{$pid}{cv}; my $ch = $children{$pid}; $ch->{pid} = $pid; signature.asc Description: PGP signature
Bug#966692: stunnel4: FTBFS because of test hang
On Thu, Aug 06, 2020 at 12:29:36AM +0300, Peter Pentchev wrote: > On Wed, Aug 05, 2020 at 10:52:31PM +0200, Michał Mirosław wrote: [...] > > Using print-debugging, I see that it stops at wait_for_child line just > > after printing the version. It seems that something is reaping the child > > before the main thread has a chance to wait for it. > > OK, so the only thing that comes to my mind now is that you may be > hitting a crazy, crazy race between register_child() and child_reaper(), > and I say "a crazy, crazy race", because the test has to (apparently > reproducibly) receive the CHLD signal exactly between the check and > the creation in register_child()'s first "$children{...} //= ...cv" > statement. Well, there is nothing that prevents SIGCHLD arriving between fork() and register_child(). You could test this with more confidence (though not 100%-reliably) by putting 'exit 1' just at the start of ($pid == 0) branch. > Can you apply the following patch and show me the output of running > the test? Sure, but I got no patch. :-) Best Regards, Michał Mirosław
Bug#966692: stunnel4: FTBFS because of test hang
On Wed, Aug 05, 2020 at 10:52:31PM +0200, Michał Mirosław wrote: > On Wed, Aug 05, 2020 at 09:28:12PM +0300, Peter Pentchev wrote: > > On Wed, Aug 05, 2020 at 08:01:53PM +0200, Michał Mirosław wrote: > > > On Sun, Aug 02, 2020 at 05:34:40PM +0300, Peter Pentchev wrote: > > > > On Sun, Aug 02, 2020 at 02:02:22AM +0200, Michał Mirosław wrote: > > > [...] > > > > --- a/debian/tests/runtime > > > > +++ b/debian/tests/runtime > > > > @@ -432,6 +432,7 @@ MAIN: > > > > > > > > if (!defined $line) { > > > > $eof->send($got_version); > > > > + undef $f_out; > > > > } elsif (!$got_version) { > > > > if ($line =~ m{^ > > > > stunnel \s+ > > > > > > I believe $f_out will not be defined here, as it only gets set after > > > sub{} is created. Perl confirms this: > > > > > > $ TEST_STUNNEL=src/stunnel strace -f -o /tmp/log debian/tests/runtime > > > Global symbol "$f_out" requires explicit package name (did you forget to > > > declare "my $f_out"?) at debian/tests/runtime line 435. > > > Execution of debian/tests/runtime aborted due to compilation errors. > > > > Of course you're right. Sorry about that! That's what I get for writing > > a patch three minutes before I have to head out and never remembering to > > actually test it later :( > > > > How about the attached one? [snip] > > This stops the endless readings of EOF, but: > > 1. the FD gets leaked (shouldn't matter much, though) > 2. the test hangs anyway > > Using print-debugging, I see that it stops at wait_for_child line just > after printing the version. It seems that something is reaping the child > before the main thread has a chance to wait for it. OK, so the only thing that comes to my mind now is that you may be hitting a crazy, crazy race between register_child() and child_reaper(), and I say "a crazy, crazy race", because the test has to (apparently reproducibly) receive the CHLD signal exactly between the check and the creation in register_child()'s first "$children{...} //= ...cv" statement. Can you apply the following patch and show me the output of running the test? G'luck, Peter -- Peter Pentchev r...@ringlet.net r...@debian.org p...@storpool.com PGP key:http://people.FreeBSD.org/~roam/roam.key.asc Key fingerprint 2EE7 A7A5 17FC 124C F115 C354 651E EFB0 2527 DF13 signature.asc Description: PGP signature
Bug#966692: stunnel4: FTBFS because of test hang
On Wed, Aug 05, 2020 at 09:28:12PM +0300, Peter Pentchev wrote: > On Wed, Aug 05, 2020 at 08:01:53PM +0200, Michał Mirosław wrote: > > On Sun, Aug 02, 2020 at 05:34:40PM +0300, Peter Pentchev wrote: > > > On Sun, Aug 02, 2020 at 02:02:22AM +0200, Michał Mirosław wrote: > > [...] > > > --- a/debian/tests/runtime > > > +++ b/debian/tests/runtime > > > @@ -432,6 +432,7 @@ MAIN: > > > > > > if (!defined $line) { > > > $eof->send($got_version); > > > + undef $f_out; > > > } elsif (!$got_version) { > > > if ($line =~ m{^ > > > stunnel \s+ > > > > I believe $f_out will not be defined here, as it only gets set after > > sub{} is created. Perl confirms this: > > > > $ TEST_STUNNEL=src/stunnel strace -f -o /tmp/log debian/tests/runtime > > Global symbol "$f_out" requires explicit package name (did you forget to > > declare "my $f_out"?) at debian/tests/runtime line 435. > > Execution of debian/tests/runtime aborted due to compilation errors. > > Of course you're right. Sorry about that! That's what I get for writing > a patch three minutes before I have to head out and never remembering to > actually test it later :( > > How about the attached one? [...] > --- a/debian/tests/runtime > +++ b/debian/tests/runtime > @@ -424,7 +424,8 @@ MAIN: > > my ($got_version, $before_version) = (undef, ''); > my $eof = AnyEvent->condvar; > - my $f_out = AnyEvent->io( > + my $f_out; > + $f_out = AnyEvent->io( > fh => $s_in, > poll => 'r', > cb => sub { > @@ -432,6 +433,7 @@ MAIN: > > if (!defined $line) { > $eof->send($got_version); > + undef $f_out; > } elsif (!$got_version) { > if ($line =~ m{^ > stunnel \s+ This stops the endless readings of EOF, but: 1. the FD gets leaked (shouldn't matter much, though) 2. the test hangs anyway Using print-debugging, I see that it stops at wait_for_child line just after printing the version. It seems that something is reaping the child before the main thread has a chance to wait for it. >From strace: 4285 +++ exited with 0 +++ 4284 <... epoll_ctl resumed> ) = 0 4284 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=4285, si_uid=1000, si_status=0, si_utime=0, si_stime=0} --- 4284 write(4, "\1\0\0\0\0\0\0\0", 8) = 8 4284 rt_sigreturn({mask=[PIPE]}) = 0 4284 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}, {EPOLLIN, {u32=4, u64=4294967300}}], 64, 59743) = 2 4284 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=4294967301}}) = 0 4284 read(4, "\1\0\0\0\0\0\0\0", 8)= 8 4284 wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG|WSTOPPED|WCONTINUED, NULL) = 4285 4284 wait4(-1, 0x7ffcd56d5784, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes) 4284 wait4(-1, 0x7ffcd56d5534, WNOHANG, NULL) = -1 ECHILD (No child processes) This is before the 'wait_for_child' a few lines later. 4284 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}], 64, 59743) = 1 4284 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=4294967301}}) = 0 4284 read(5, "", 8192) = 0 4284 write(1, "Got stunnel version 5.56\n", 25) = 25 4284 write(2, "wait child 4285\n", 16) = 16 This is version printout plus my check. 4284 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}], 64, 59743) = 1 4284 epoll_ctl(3, EPOLL_CTL_DEL, 5, 0x5631c0d44a40) = 0 This clears watcher for the pipe. 4284 epoll_wait(3, 0x5631c0d44a40, 64, 59743) = -1 EINTR (Interrupted system call) And this waits forever. Best Regards Michał Mirosław
Bug#966692: stunnel4: FTBFS because of test hang
On Wed, Aug 05, 2020 at 08:01:53PM +0200, Michał Mirosław wrote: > On Sun, Aug 02, 2020 at 05:34:40PM +0300, Peter Pentchev wrote: > > On Sun, Aug 02, 2020 at 02:02:22AM +0200, Michał Mirosław wrote: > [...] > > --- a/debian/tests/runtime > > +++ b/debian/tests/runtime > > @@ -432,6 +432,7 @@ MAIN: > > > > if (!defined $line) { > > $eof->send($got_version); > > + undef $f_out; > > } elsif (!$got_version) { > > if ($line =~ m{^ > > stunnel \s+ > > I believe $f_out will not be defined here, as it only gets set after > sub{} is created. Perl confirms this: > > $ TEST_STUNNEL=src/stunnel strace -f -o /tmp/log debian/tests/runtime > Global symbol "$f_out" requires explicit package name (did you forget to > declare "my $f_out"?) at debian/tests/runtime line 435. > Execution of debian/tests/runtime aborted due to compilation errors. Of course you're right. Sorry about that! That's what I get for writing a patch three minutes before I have to head out and never remembering to actually test it later :( How about the attached one? G'luck, Peter -- Peter Pentchev r...@ringlet.net r...@debian.org p...@storpool.com PGP key:http://people.FreeBSD.org/~roam/roam.key.asc Key fingerprint 2EE7 A7A5 17FC 124C F115 C354 651E EFB0 2527 DF13 commit eb303ad2e9c925bf7243e6877d8598d0356d68f9 Author: Peter Pentchev Date: Sun Aug 2 17:31:26 2020 +0300 Destroy the stunnel version watcher on EOF. diff --git a/debian/tests/runtime b/debian/tests/runtime index ecffe7b..f594d9a 100755 --- a/debian/tests/runtime +++ b/debian/tests/runtime @@ -424,7 +424,8 @@ MAIN: my ($got_version, $before_version) = (undef, ''); my $eof = AnyEvent->condvar; - my $f_out = AnyEvent->io( + my $f_out; + $f_out = AnyEvent->io( fh => $s_in, poll => 'r', cb => sub { @@ -432,6 +433,7 @@ MAIN: if (!defined $line) { $eof->send($got_version); + undef $f_out; } elsif (!$got_version) { if ($line =~ m{^ stunnel \s+ signature.asc Description: PGP signature
Bug#966692: stunnel4: FTBFS because of test hang
On Sun, Aug 02, 2020 at 05:34:40PM +0300, Peter Pentchev wrote: > On Sun, Aug 02, 2020 at 02:02:22AM +0200, Michał Mirosław wrote: [...] > --- a/debian/tests/runtime > +++ b/debian/tests/runtime > @@ -432,6 +432,7 @@ MAIN: > > if (!defined $line) { > $eof->send($got_version); > + undef $f_out; > } elsif (!$got_version) { > if ($line =~ m{^ > stunnel \s+ I believe $f_out will not be defined here, as it only gets set after sub{} is created. Perl confirms this: $ TEST_STUNNEL=src/stunnel strace -f -o /tmp/log debian/tests/runtime Global symbol "$f_out" requires explicit package name (did you forget to declare "my $f_out"?) at debian/tests/runtime line 435. Execution of debian/tests/runtime aborted due to compilation errors. $ dpkg -l perl Desired=Unknown/Install/Remove/Purge/Hold | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad) ||/ Name Version Architecture Description +++-==---= ii perl 5.28.1-6 amd64Larry Wall's Practical Extraction and Report Language Best Regards Michał Mirosław
Bug#966692: stunnel4: FTBFS because of test hang
On Sun, Aug 02, 2020 at 02:02:22AM +0200, Michał Mirosław wrote: > Source: stunnel4 > Severity: serious > Tags: ftbfs > Justification: fails to build from source (but built successfully in the past) > X-Debbugs-Cc: Michał Mirosław > > The bug #955710 turns out to be only partially fixed. Now without a binary > to exec the test runtime doesn't hang. When the binary is correct, though, > it still can't stop asking the pipe for more data after EOF. > > $ dpkg-buildpackage -b > [...] > env TEST_STUNNEL=.../stunnel4-5.56+dfsg/src/stunnel debian/tests/runtime > Found the certificate at debian/tests/certs/certificate.pem and the private > key at debian/tests/certs/key.pem > Using the /tmp/AgCSdk8YKw temporary directory > About to get the stunnel version information > Got stunnel version 5.56 > ^Cmake[1]: *** [debian/rules:32: execute_before_dh_auto_test] Interrupt > make: *** [debian/rules:106: binary] Error 1 > > From the manual run under strace: > > $ TEST_STUNNEL=src/stunnel strace -f -o /tmp/log debian/tests/runtime > > I can see (excerpt): [snip useful excerpt] Hi, Thanks a lot for reporting this and for the strace output! Can you check whether the attached trivial patch will work for you? There is a slightly more verbose way to do pretty much the same, but let's see if this works first. If it does... sorry, I really should have done this from the start! G'luck, Peter -- Peter Pentchev r...@ringlet.net r...@debian.org p...@storpool.com PGP key:http://people.FreeBSD.org/~roam/roam.key.asc Key fingerprint 2EE7 A7A5 17FC 124C F115 C354 651E EFB0 2527 DF13 commit fb2a11a3654e3070084515da89f415361671f922 Author: Peter Pentchev Date: Sun Aug 2 17:31:26 2020 +0300 Destroy the stunnel version watcher on EOF. diff --git a/debian/tests/runtime b/debian/tests/runtime index ecffe7b..109c965 100755 --- a/debian/tests/runtime +++ b/debian/tests/runtime @@ -432,6 +432,7 @@ MAIN: if (!defined $line) { $eof->send($got_version); + undef $f_out; } elsif (!$got_version) { if ($line =~ m{^ stunnel \s+ signature.asc Description: PGP signature
Bug#966692: stunnel4: FTBFS because of test hang
Source: stunnel4 Severity: serious Tags: ftbfs Justification: fails to build from source (but built successfully in the past) X-Debbugs-Cc: Michał Mirosław The bug #955710 turns out to be only partially fixed. Now without a binary to exec the test runtime doesn't hang. When the binary is correct, though, it still can't stop asking the pipe for more data after EOF. $ dpkg-buildpackage -b [...] env TEST_STUNNEL=.../stunnel4-5.56+dfsg/src/stunnel debian/tests/runtime Found the certificate at debian/tests/certs/certificate.pem and the private key at debian/tests/certs/key.pem Using the /tmp/AgCSdk8YKw temporary directory About to get the stunnel version information Got stunnel version 5.56 ^Cmake[1]: *** [debian/rules:32: execute_before_dh_auto_test] Interrupt make: *** [debian/rules:106: binary] Error 1 From the manual run under strace: $ TEST_STUNNEL=src/stunnel strace -f -o /tmp/log debian/tests/runtime I can see (excerpt): 11715 read(5, 11715 <... read resumed> "stunnel 5.56 on x86_64-pc-linux-"..., 8192) = 95 [more reads] 11715 read(5, "TIMEOUTbusy= 300 sec"..., 8192) = 178 11715 epoll_wait(3, 11715 <... epoll_wait resumed> [{EPOLLHUP, {u32=5, u64=4294967301}}], 64, 59743) = 1 11715 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=4294967301}} [11716 +++ exited with 0 +++] 11715 <... epoll_ctl resumed> ) = 0 11715 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=11716, si_uid=1000, si_status=0, si_utime=0, si_stime=0} --- 11715 write(4, "\1\0\0\0\0\0\0\0", 8) = 8 11715 rt_sigreturn({mask=[]}) = 0 11715 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}, {EPOLLIN, {u32=4, u64=4294967300}}], 64, 59743) = 2 11715 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=4294967301}}) = 0 11715 read(4, "\1\0\0\0\0\0\0\0", 8)= 8 11715 wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG|WSTOPPED|WCONTINUED, NULL) = 11716 11715 wait4(-1, 0x7ffc665c7904, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes) 11715 wait4(-1, 0x7ffc665c76b4, WNOHANG, NULL) = -1 ECHILD (No child processes) 11715 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}], 64, 59743) = 1 11715 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=4294967301}}) = 0 11715 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}], 64, 59743) = 1 11715 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=4294967301}}) = 0 11715 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}], 64, 59743) = 1 11715 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=4294967301}}) = 0 11715 read(5, "", 8192) = 0 11715 write(1, "Got stunnel version 5.56\n", 25) = 25 11715 epoll_wait(3, [{EPOLLHUP, {u32=5, u64=4294967301}}], 64, 59743) = 1 11715 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=4294967301}}) = 0 11715 read(5, "", 8192) = 0 [last 3 lines repeated forever] -- System Information: Debian Release: 10.5 APT prefers stable-debug APT policy: (900, 'stable-debug'), (900, 'proposed-updates'), (900, 'stable'), (800, 'testing'), (700, 'unstable'), (600, 'experimental'), (540, 'oldstable'), (500, 'unstable-debug'), (500, 'testing-debug'), (500, 'proposed-updates-debug'), (1, 'experimental-debug') Architecture: amd64 (x86_64) Foreign Architectures: i386 Kernel: Linux 5.7.10mq+ (SMP w/8 CPU threads; PREEMPT) Kernel taint flags: TAINT_OOT_MODULE Locale: LANG=pl_PL.UTF-8, LC_CTYPE=pl_PL.UTF-8 (charmap=UTF-8), LANGUAGE not set Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) -- no debconf information test-runtime.strace.log.gz Description: application/gzip