Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
tag 721421 patch
thanks

On Fri, Sep 26, 2014 at 11:12:06PM +0300, Niko Tyni wrote:

> The problem apparently happens when the timeout in the select loop (one second) triggers before execvp() has been called. I can reproduce a similar race on my x86_64 machine by inserting a sleep(1) call right before the execvp() call.
>
> I still haven't got to the bottom of it, but it looks like the gdb output is lost somewhere with select() timeouting (and returning zero) on subsequent calls too even though gdb has happily written to the pipe.

Further investigation with strace shows that the fd_set passed into select() becomes empty if execvp() happens after the first select() call. I was able to reproduce this with gdb replaced by a trivial program that just prints to stdout (which greatly helped debugging.)

So I suppose the execvp() call somehow invalidates the fd set? I haven't found an explanation for this observed behaviour. The closest thing I was able to find was this in the select_tut(2) Linux manual page (on Debian sid if that matters):

  11. Since select() modifies its file descriptor sets, if the call is being used in a loop, then the sets must be reinitialized before each call.

Reinitializing the set in the loop fixes it and seems to be the correct thing to do anyway.

Patch attached, this makes it work for me on both mips and amd64.
-- 
Niko Tyni   nt...@debian.org

From 88d953d71051fe45a4983f1cce9810f7ae942c56 Mon Sep 17 00:00:00 2001
From: Niko Tyni <nt...@debian.org>
Date: Sat, 27 Sep 2014 10:35:27 +0300
Subject: [PATCH] Reinitialize the fd set in the select loop

This fixes test failures on slow hosts. It looks like execvp() happening in the child after the first select() call invalidates the set.

Quoting the Linux select_tut(2) manual page:

  11. Since select() modifies its file descriptor sets, if the call is being used in a loop, then the sets must be reinitialized before each call.

Bug-Debian: https://bugs.debian.org/721421
---
 bt.xs | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/bt.xs b/bt.xs
index 6b9fed6..f892c8c 100644
--- a/bt.xs
+++ b/bt.xs
@@ -88,9 +88,6 @@ stack_trace (char **args)
         _exit(0);
     }
 
-    FD_ZERO(&fdset);
-    FD_SET(out_fd[0], &fdset);
-
     write(in_fd[1], "thread apply all backtrace\n", 27);
     write(in_fd[1], "quit\n", 5);
 
@@ -105,6 +102,9 @@ stack_trace (char **args)
         tv.tv_sec = 1;
         tv.tv_usec = 0;
 
+        FD_ZERO(&fdset);
+        FD_SET(out_fd[0], &fdset);
+
         sel = select(FD_SETSIZE, &fdset, NULL, NULL, &tv);
         if (sel == -1)
             break;
-- 
2.1.1
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
On Sat, Sep 27, 2014 at 9:48 AM, Niko Tyni <nt...@debian.org> wrote:

> tag 721421 patch
> thanks
>
> On Fri, Sep 26, 2014 at 11:12:06PM +0300, Niko Tyni wrote:
>
> > The problem apparently happens when the timeout in the select loop (one second) triggers before execvp() has been called. I can reproduce a similar race on my x86_64 machine by inserting a sleep(1) call right before the execvp() call.
> >
> > I still haven't got to the bottom of it, but it looks like the gdb output is lost somewhere with select() timeouting (and returning zero) on subsequent calls too even though gdb has happily written to the pipe.
>
> Further investigation with strace shows that the fd_set passed into select() becomes empty if execvp() happens after the first select() call. I was able to reproduce this with gdb replaced by a trivial program that just prints to stdout (which greatly helped debugging.)
>
> So I suppose the execvp() call somehow invalidates the fd set? I haven't found an explanation for this observed behaviour. The closest thing I was able to find was this in the select_tut(2) Linux manual page (on Debian sid if that matters):
>
>   11. Since select() modifies its file descriptor sets, if the call is being used in a loop, then the sets must be reinitialized before each call.
>
> Reinitializing the set in the loop fixes it and seems to be the correct thing to do anyway.
>
> Patch attached, this makes it work for me on both mips and amd64.

Right, that is definitely a bug. I haven't used select in such a long time that I had overlooked that insanity.

Leon
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
On Thu, Sep 25, 2014 at 03:49:20PM +0200, Leon Timmermans wrote:

> On Tue, Sep 23, 2014 at 9:59 PM, Niko Tyni <nt...@debian.org> wrote:
>
> > I also had a look at the mips one, and there the problem doesn't seem to be with the backtrace, as running gdb separately works as expected. However, running perl with -d:bt doesn't seem to do anything. It looks like the host is just too slow; inserting a 'sleep(1)' just before the thread apply all backtrace command in stack_trace() fixes it for me. Perhaps the code should just check that the fd is ready for writing?
>
> This should not matter. Pipes are buffered at a kernel level. This is not making sense to me.

Right. Sorry for not looking into it better.

The problem apparently happens when the timeout in the select loop (one second) triggers before execvp() has been called. I can reproduce a similar race on my x86_64 machine by inserting a sleep(1) call right before the execvp() call.

I still haven't got to the bottom of it, but it looks like the gdb output is lost somewhere with select() timeouting (and returning zero) on subsequent calls too even though gdb has happily written to the pipe.

Will continue investigation.
-- 
Niko Tyni   nt...@debian.org
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
On Tue, 2014-09-23 22:59:46 +0300, Niko Tyni wrote:

> I also had a look at the mips one, and there the problem doesn't seem to be with the backtrace, as running gdb separately works as expected. However, running perl with -d:bt doesn't seem to do anything. It looks like the host is just too slow; inserting a 'sleep(1)' just before the thread apply all backtrace command in stack_trace() fixes it for me. Perhaps the code should just check that the fd is ready for writing?

With Niko's workaround, libdevel-bt-perl-0.06 builds on mips and mipsel.

debdiff libdevel-bt-perl_0.06-2.dsc libdevel-bt-perl_0.06-2.1.dsc

diff -Nru libdevel-bt-perl-0.06/debian/changelog libdevel-bt-perl-0.06/debian/changelog
--- libdevel-bt-perl-0.06/debian/changelog 2014-09-04 13:24:11.0 +0100
+++ libdevel-bt-perl-0.06/debian/changelog 2014-09-25 12:08:43.0 +0100
@@ -1,3 +1,13 @@
+libdevel-bt-perl (0.06-2.1) unstable; urgency=medium
+
+  * Non-maintainer upload.
+  * Fix FTBFS on mips and mipsel.
+    Add sleep-bt.xs.patch.
+    Patch by Niko Tyni.
+    Closes: #721421.
+
+ -- Anibal Monsalve Salazar <ani...@debian.org>  Thu, 25 Sep 2014 12:08:40 +0100
+
 libdevel-bt-perl (0.06-2) unstable; urgency=low
 
   [ gregor herrmann ]
diff -Nru libdevel-bt-perl-0.06/debian/patches/series libdevel-bt-perl-0.06/debian/patches/series
--- libdevel-bt-perl-0.06/debian/patches/series 2014-09-04 13:24:11.0 +0100
+++ libdevel-bt-perl-0.06/debian/patches/series 2014-09-25 12:07:45.0 +0100
@@ -1,2 +1,3 @@
 hurd_path_max.patch
 0001-Raise-instead-of-kill-the-signal.patch
+sleep-bt.xs.patch
diff -Nru libdevel-bt-perl-0.06/debian/patches/sleep-bt.xs.patch libdevel-bt-perl-0.06/debian/patches/sleep-bt.xs.patch
--- libdevel-bt-perl-0.06/debian/patches/sleep-bt.xs.patch 1970-01-01 01:00:00.0 +0100
+++ libdevel-bt-perl-0.06/debian/patches/sleep-bt.xs.patch 2014-09-25 12:07:59.0 +0100
@@ -0,0 +1,25 @@
+From: Niko Tyni <nt...@debian.org>
+Subject: Re: Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
+Date: Tue, 23 Sep 2014 22:59:46 +0300
+
+https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=721421
+
+I had a look at the mips one, and there the problem doesn't seem to be with the
+backtrace, as running gdb separately works as expected. However, running perl
+with -d:bt doesn't seem to do anything. It looks like the host is just too
+slow; inserting a 'sleep(1)' just before the thread apply all backtrace
+command in stack_trace() fixes it for me. Perhaps the code should just check
+that the fd is ready for writing?
+
+Index: libdevel-bt-perl-0.06/bt.xs
+===
+--- libdevel-bt-perl-0.06.orig/bt.xs
++++ libdevel-bt-perl-0.06/bt.xs
+@@ -91,6 +91,7 @@ stack_trace (char **args)
+     FD_ZERO(&fdset);
+     FD_SET(out_fd[0], &fdset);
+ 
++    sleep(1);
+     write(in_fd[1], "thread apply all backtrace\n", 27);
+     write(in_fd[1], "quit\n", 5);
+ 
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
On Tue, Sep 23, 2014 at 9:59 PM, Niko Tyni <nt...@debian.org> wrote:

> I also had a look at the mips one, and there the problem doesn't seem to be with the backtrace, as running gdb separately works as expected. However, running perl with -d:bt doesn't seem to do anything. It looks like the host is just too slow; inserting a 'sleep(1)' just before the thread apply all backtrace command in stack_trace() fixes it for me. Perhaps the code should just check that the fd is ready for writing?

This should not matter. Pipes are buffered at a kernel level. This is not making sense to me.

Leon
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
On Mon, Sep 22, 2014 at 12:21 AM, gregor herrmann <gre...@debian.org> wrote:

> Here we go:
>
> That's armhf on a Debian box in an unstable chroot:
>
> (sid_armhf-dchroot)gregoa@harris:~$ (echo r; echo bt; echo quit) | gdb --args perl -e 'unpack "p", pack "L!", 1' | egrep '^#'
> #1  0xb6f3f048 in Perl_newSVpv () from /usr/lib/arm-linux-gnueabihf/libperl.so.5.20
> #2  0x00040002 in ?? ()

That looks wrong to me. Does a debugging perl show the same result?

Leon
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
On Tue, 23 Sep 2014 17:31:52 +0200, Leon Timmermans wrote:

> > That's armhf on a Debian box in an unstable chroot:
> >
> > (sid_armhf-dchroot)gregoa@harris:~$ (echo r; echo bt; echo quit) | gdb --args perl -e 'unpack "p", pack "L!", 1' | egrep '^#'
> > #1  0xb6f3f048 in Perl_newSVpv () from /usr/lib/arm-linux-gnueabihf/libperl.so.5.20
> > #2  0x00040002 in ?? ()
>
> That looks wrong to me. Does a debugging perl show the same result?

Let's see:

Same machine, same chroot, this time with perl-debug installed:

(sid_armhf-dchroot)gregoa@harris:~$ (echo r; echo bt; echo quit) | gdb --args perl -e 'unpack "p", pack "L!", 1' | egrep '^#'
#1  0xb6f3f048 in Perl_newSVpv (my_perl=0x22008, s=0x1 <error: Cannot access memory at address 0x1>, len=0)
#2  0x00040002 in ?? ()

FTR:

$ perl -v

This is perl 5, version 20, subversion 1 (v5.20.1) built for arm-linux-gnueabihf-thread-multi-64int
(with 37 registered patches, see perl -V for more detail)

Cheers,
gregor

-- 
 .''`.  Homepage: http://info.comodo.priv.at/ - OpenPGP key 0xBB3A68018649AA06
 : :' : Debian GNU/Linux user, admin, and developer - http://www.debian.org/
 `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
   `-   NP: Mark Knopfler: Irish Love
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
On Tue, Sep 23, 2014 at 05:50:00PM +0200, gregor herrmann wrote:

> On Tue, 23 Sep 2014 17:31:52 +0200, Leon Timmermans wrote:
>
> > > That's armhf on a Debian box in an unstable chroot:
> > >
> > > (sid_armhf-dchroot)gregoa@harris:~$ (echo r; echo bt; echo quit) | gdb --args perl -e 'unpack "p", pack "L!", 1' | egrep '^#'
> > > #1  0xb6f3f048 in Perl_newSVpv () from /usr/lib/arm-linux-gnueabihf/libperl.so.5.20
> > > #2  0x00040002 in ?? ()
> >
> > That looks wrong to me. Does a debugging perl show the same result?
>
> Let's see:
>
> Same machine, same chroot, this time with perl-debug installed:
>
> (sid_armhf-dchroot)gregoa@harris:~$ (echo r; echo bt; echo quit) | gdb --args perl -e 'unpack "p", pack "L!", 1' | egrep '^#'
> #1  0xb6f3f048 in Perl_newSVpv (my_perl=0x22008, s=0x1 <error: Cannot access memory at address 0x1>, len=0)
> #2  0x00040002 in ?? ()

I've reproduced this armhf issue and filed #762620 against gdb about it.

I think removing the armhf binaries of libdevel-bt-perl is a reasonable workaround for this, but let's give the gdb maintainer a bit of time to investigate first. (Removing the binaries would mean that we don't provide an armhf version of the package unless it starts working again.)

I also had a look at the mips one, and there the problem doesn't seem to be with the backtrace, as running gdb separately works as expected. However, running perl with -d:bt doesn't seem to do anything. It looks like the host is just too slow; inserting a 'sleep(1)' just before the thread apply all backtrace command in stack_trace() fixes it for me. Perhaps the code should just check that the fd is ready for writing?
-- 
Niko Tyni   nt...@debian.org
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
On Thu, Sep 04, 2014 at 02:25:44PM +0200, gregor herrmann wrote:

> On Mon, 01 Sep 2014 19:24:18 +0200, Leon Timmermans wrote:
>
> > The attached patch might fix the issue on Hurd, I can't really say much about the issue on armel or kfreebsd-amd64 without having some build/test output from them though.
>
> Thanks for the patch, I've applied it now and uploaded the new version.
>
> Build logs (with TEST_VERBOSE=1) will show up shortly at
> https://buildd.debian.org/status/package.php?p=libdevel-bt-perl
> or
> https://buildd.debian.org/status/logs.php?pkg=libdevel-bt-perl

Hurd is still failing, although that's not the priority at the moment. It's just armhf and mips we care about, since the bug is only considered release-critical if it's a regression, and it's never built successfully on kFreeBSD, Hurd or arm64 (and at least Hurd isn't a release architecture).

Here are some log extracts from recent builds:

armhf:
https://buildd.debian.org/status/fetch.php?pkg=libdevel-bt-perl&arch=armhf&ver=0.06-2&stamp=1409851075

mips:
https://buildd.debian.org/status/fetch.php?pkg=libdevel-bt-perl&arch=mips&ver=0.06-2&stamp=1410539249

Are those log extracts sufficient for you to advise on the correct fix?

Thanks!
Dominic.
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
On Sun, Sep 21, 2014 at 7:37 PM, Dominic Hargreaves <d...@earth.li> wrote:

> On Thu, Sep 04, 2014 at 02:25:44PM +0200, gregor herrmann wrote:
>
> > On Mon, 01 Sep 2014 19:24:18 +0200, Leon Timmermans wrote:
> >
> > > The attached patch might fix the issue on Hurd, I can't really say much about the issue on armel or kfreebsd-amd64 without having some build/test output from them though.
> >
> > Thanks for the patch, I've applied it now and uploaded the new version.
> >
> > Build logs (with TEST_VERBOSE=1) will show up shortly at
> > https://buildd.debian.org/status/package.php?p=libdevel-bt-perl
> > or
> > https://buildd.debian.org/status/logs.php?pkg=libdevel-bt-perl
>
> Hurd is still failing, although that's not the priority at the moment. It's just armhf and mips we care about, since the bug is only considered release-critical if it's a regression, and it's never built successfully on kFreeBSD, Hurd or arm64 (and at least Hurd isn't a release architecture).
>
> Here are some log extracts from recent builds:
>
> armhf:
> https://buildd.debian.org/status/fetch.php?pkg=libdevel-bt-perl&arch=armhf&ver=0.06-2&stamp=1409851075

That suggests the issue is missing symbols in the gdb output. What is the output on such a machine of:

(echo r; echo bt; echo quit) | gdb --args perl -e 'unpack "p", pack "L!", 1' | egrep '^#'

Does that include a perl_run entry?

> mips:
> https://buildd.debian.org/status/fetch.php?pkg=libdevel-bt-perl&arch=mips&ver=0.06-2&stamp=1410539249

That shows the gdb output to be empty. That's either a bug in devel-bt or a bug in gdb. I'd say the former is more likely, but I can't diagnose it at a distance.

Leon
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
On Sun, 21 Sep 2014 23:55:45 +0200, Leon Timmermans wrote:

> > Here are some log extracts from recent builds:
> >
> > armhf:
> > https://buildd.debian.org/status/fetch.php?pkg=libdevel-bt-perl&arch=armhf&ver=0.06-2&stamp=1409851075
>
> That suggests the issue is missing symbols in the gdb output. What is the output on such a machine of:
>
> (echo r; echo bt; echo quit) | gdb --args perl -e 'unpack "p", pack "L!", 1' | egrep '^#'
>
> Does that include a perl_run entry?

Here we go:

That's armhf on a Debian box in an unstable chroot:

(sid_armhf-dchroot)gregoa@harris:~$ (echo r; echo bt; echo quit) | gdb --args perl -e 'unpack "p", pack "L!", 1' | egrep '^#'
#1  0xb6f3f048 in Perl_newSVpv () from /usr/lib/arm-linux-gnueabihf/libperl.so.5.20
#2  0x00040002 in ?? ()

/*
Just for fun, armhf/jessie on Raspbian:

gregoa@guinan ~ % (echo r; echo bt; echo quit) | gdb --args perl -e 'unpack "p", pack "L!", 1' | egrep '^#'
Cannot access memory at address 0x0
#1  0xb6f062f8 in Perl_newSVpv () from /usr/lib/arm-linux-gnueabihf/libperl.so.5.20
#2  0xb6f767d8 in ?? () from /usr/lib/arm-linux-gnueabihf/libperl.so.5.20
*/

Cheers,
gregor

-- 
 .''`.  Homepage: http://info.comodo.priv.at/ - OpenPGP key 0xBB3A68018649AA06
 : :' : Debian GNU/Linux user, admin, and developer - http://www.debian.org/
 `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
   `-   NP: David Bowie: Rebel Rebel
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
On Mon, 01 Sep 2014 19:24:18 +0200, Leon Timmermans wrote:

> The attached patch might fix the issue on Hurd, I can't really say much about the issue on armel or kfreebsd-amd64 without having some build/test output from them though.

Thanks for the patch, I've applied it now and uploaded the new version.

Build logs (with TEST_VERBOSE=1) will show up shortly at
https://buildd.debian.org/status/package.php?p=libdevel-bt-perl
or
https://buildd.debian.org/status/logs.php?pkg=libdevel-bt-perl

Cheers,
gregor

-- 
 .''`.  Homepage: http://info.comodo.priv.at/ - OpenPGP key 0xBB3A68018649AA06
 : :' : Debian GNU/Linux user, admin, and developer - http://www.debian.org/
 `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
   `-   NP: Alanis Morissette: Hand In My Pocket
Bug#721421: libdevel-bt-perl: FTBFS on armel, hurd-i386, kfreebsd-amd64
The attached patch might fix the issue on Hurd, I can't really say much about the issue on armel or kfreebsd-amd64 without having some build/test output from them though.

Leon

From 7243c7acfa7731697dfd75e930906817588c9c2f Mon Sep 17 00:00:00 2001
From: Leon Timmermans <faw...@gmail.com>
Date: Mon, 1 Sep 2014 11:53:23 +0200
Subject: [PATCH] Raise instead of kill the signal

---
 t/basic.t | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/basic.t b/t/basic.t
index 4519941..95fbb90 100644
--- a/t/basic.t
+++ b/t/basic.t
@@ -18,7 +18,7 @@ local $ENV{PERL5LIB} = join $Config::Config{path_sep} => @INC;
 for my $signal (@signals) {
     next unless __PACKAGE__->can($signal);
     my $signum = __PACKAGE__->can($signal)->();
-    my @cmd = ($^X, qw(-d:bt -e), "kill $signum, \$\$");
+    my @cmd = ($^X, qw(-d:bt -MPOSIX=raise -e), "raise($signum)");
 
     use Capture::Tiny 'capture';
     my ($stdout) = capture { system @cmd };
-- 
1.9.1