Processed: Re: Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
Processing control commands:

> severity -1 important
Bug #785557 [perl] perl: FTBFS on i386 and amd64: itimer problems on buildds?
Severity set to 'important' from 'serious'
> retitle -1 perl: FTBFS on buildds with steal time issues
Bug #785557 [perl] perl: FTBFS on i386 and amd64: itimer problems on buildds?
Changed Bug title to 'perl: FTBFS on buildds with steal time issues' from 'perl: FTBFS on i386 and amd64: itimer problems on buildds?'

-- 
785557: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=785557
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems

-- 
To UNSUBSCRIBE, email to debian-bugs-rc-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
Control: severity -1 important
Control: retitle -1 perl: FTBFS on buildds with steal time issues

On Sun, May 24, 2015 at 07:38:19PM +0300, Apollon Oikonomopoulos wrote:
> On 16:38 Sun 24 May , Ben Hutchings wrote:
> > On Sun, 2015-05-24 at 14:09 +0300, Niko Tyni wrote:
> > > On Sun, May 24, 2015 at 02:55:00PM +0800, Paul Wise wrote:
> > > > On Sat, 2015-05-23 at 19:10 +0200, Dominic Hargreaves wrote:
> > > > > This is rather strange; any ideas from DSA?
> > > >
> > > > The underlying hosts do not have the same issue.
> > > > All of the guests use the same virtual CPU version/flags.
> > > > All of the guests use the same Linux kernel version.
> > >
> > > Thanks for the update.
> > >
> > > > I guess diving into the Linux implementation of times(2) for
> > > > clues would be the next step for figuring out what the issue is
> > > > here.
> > >
> > > I'm taking the kernel maintainers in the loop.
> > >
> > > The status here is that times(2) seems to be misbehaving on some
> > > i386 and amd64 debian.org virtual hosts running jessie (under
> > > ganeti/qemu, with jessie on the underlying hosts too). These hosts
> > > include at least barriere and x86-grnet-01.
> > >
> > > The misbehaviour is that user time stays at zero all the time, as
> > > seen for example with 'time yes'. This is making perl fail to build
> > > from source due to test failures, and I'd expect it to affect other
> > > things too.
> > >
> > > Any help is appreciated.
> >
> > I can't reproduce this, but wonder if it's related to #784960?
>
> There seems to be something fundamentally broken in
> barriere.debian.org's CPU time accounting, not related to times(2)
> per se. Just issuing `yes > /dev/null' and firing up `top -d1' gives
> the following interesting results:
>
>  - `yes' shows up taking 100% CPU time as expected, but
>  - pressing `1' shows that all CPUs are idle (!)
>
> htop OTOH displays all CPUs as constantly 100% busy, which is
> inconsistent with the system's load average (~0.8 at that point).
>
> Also watching the output of
>
>   `cat /proc/$(pidof yes)/stat | awk '{ print $14, $15 }''
>
> ($14 is utime, $15 is stime per proc(5)) indeed shows 100% system
> time and 0 user time.
>
> If you look at the `top' stats for all CPUs of barriere.debian.org,
> it looks as if the only thing that's correctly being accounted for is
> iowait time.

Great to hear that you found the underlying cause[1] of this! I note
the workaround:

  -cpu qemu64,-kvm_steal_time

which may be applicable to the Debian hosts?

Cheers,
Dominic.

[1] https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg01295.html
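Since the retitled bug points at steal time accounting, a minimal sketch of how one might watch the per-category CPU counters that `top' summarises, assuming a Linux guest with /proc/stat in the layout documented in proc(5). The function names are illustrative only and nothing here comes from the thread itself; it only shows which counters advance over one second and does not diagnose the qemu issue.

```python
import time

# First eight counters of a /proc/stat "cpu" line, in proc(5) order.
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Map the first eight counters of a /proc/stat 'cpu' line to their names."""
    parts = line.split()
    values = (int(v) for v in parts[1:1 + len(CPU_FIELDS)])
    return dict(zip(CPU_FIELDS, values))


def counter_deltas(before, after):
    """Return how far each counter advanced between two samples (in USER_HZ ticks)."""
    return {name: after[name] - before[name] for name in CPU_FIELDS}


def read_total_cpu(path="/proc/stat"):
    """Read the aggregate 'cpu' line, which is the first line of /proc/stat."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())


if __name__ == "__main__":
    first = read_total_cpu()
    time.sleep(1)
    second = read_total_cpu()
    for name, ticks in counter_deltas(first, second).items():
        print(f"{name:>8}: {ticks}")
```

Running this while `yes > /dev/null' is busy would be one way to compare an affected guest with a working one: on a healthy host the `user' counter should be the one advancing.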
Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
On Sun, May 24, 2015 at 07:38:19PM +0300, Apollon Oikonomopoulos wrote:
> On 16:38 Sun 24 May , Ben Hutchings wrote:
> > On Sun, 2015-05-24 at 14:09 +0300, Niko Tyni wrote:
> > > On Sun, May 24, 2015 at 02:55:00PM +0800, Paul Wise wrote:
> > > > On Sat, 2015-05-23 at 19:10 +0200, Dominic Hargreaves wrote:
> > > > > This is rather strange; any ideas from DSA?
> > > >
> > > > The underlying hosts do not have the same issue.
> > > > All of the guests use the same virtual CPU version/flags.
> > > > All of the guests use the same Linux kernel version.
> > >
> > > Thanks for the update.
> > >
> > > > I guess diving into the Linux implementation of times(2) for
> > > > clues would be the next step for figuring out what the issue is
> > > > here.
> > >
> > > I'm taking the kernel maintainers in the loop.
> > >
> > > The status here is that times(2) seems to be misbehaving on some
> > > i386 and amd64 debian.org virtual hosts running jessie (under
> > > ganeti/qemu, with jessie on the underlying hosts too). These hosts
> > > include at least barriere and x86-grnet-01.
> > >
> > > The misbehaviour is that user time stays at zero all the time, as
> > > seen for example with 'time yes'. This is making perl fail to build
> > > from source due to test failures, and I'd expect it to affect other
> > > things too.
> > >
> > > Any help is appreciated.
> >
> > I can't reproduce this, but wonder if it's related to #784960?
>
> There seems to be something fundamentally broken in
> barriere.debian.org's CPU time accounting, not related to times(2)
> per se. Just issuing `yes > /dev/null' and firing up `top -d1' gives
> the following interesting results:
>
>  - `yes' shows up taking 100% CPU time as expected, but
>  - pressing `1' shows that all CPUs are idle (!)
>
> htop OTOH displays all CPUs as constantly 100% busy, which is
> inconsistent with the system's load average (~0.8 at that point).
>
> Also watching the output of
>
>   `cat /proc/$(pidof yes)/stat | awk '{ print $14, $15 }''
>
> ($14 is utime, $15 is stime per proc(5)) indeed shows 100% system
> time and 0 user time.
>
> If you look at the `top' stats for all CPUs of barriere.debian.org,
> it looks as if the only thing that's correctly being accounted for is
> iowait time.

It looks like the same thing has happened again on x86-grnet-01,
meaning we have issues[1] on

  x86-grnet-01
  brahms
  binet

but not

  babin
  x86-csail-01

Buildd admins: please can the amd64 build of perl 5.22.0~rc2-2 be
given-back to see if it lands on a working host?

DSA: can you identify any differences between the working hosts and
the others which would help identify the root of this problem,
assuming that they all exhibit the same easy to reproduce behaviour
seen above?

Thanks!
Dominic.

[1] https://buildd.debian.org/status/logs.php?pkg=perl&arch=amd64
Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
On Mon, Jun 01, 2015 at 04:14:32PM +0100, Dominic Hargreaves wrote:
> Buildd admins: please can the amd64 build of perl 5.22.0~rc2-2 be
> given-back to see if it lands on a working host?

Given back.

Kurt
Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
On Sun, May 24, 2015 at 02:55:00PM +0800, Paul Wise wrote:
> On Sat, 2015-05-23 at 19:10 +0200, Dominic Hargreaves wrote:
> > This is rather strange; any ideas from DSA?
>
> The underlying hosts do not have the same issue.
> All of the guests use the same virtual CPU version/flags.
> All of the guests use the same Linux kernel version.

Thanks for the update.

> I guess diving into the Linux implementation of times(2) for clues
> would be the next step for figuring out what the issue is here.

I'm taking the kernel maintainers in the loop.

The status here is that times(2) seems to be misbehaving on some i386
and amd64 debian.org virtual hosts running jessie (under ganeti/qemu,
with jessie on the underlying hosts too). These hosts include at least
barriere and x86-grnet-01.

The misbehaviour is that user time stays at zero all the time, as seen
for example with 'time yes'. This is making perl fail to build from
source due to test failures, and I'd expect it to affect other things
too.

Any help is appreciated.
-- 
Niko Tyni nt...@debian.org
Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
On 16:38 Sun 24 May , Ben Hutchings wrote:
> On Sun, 2015-05-24 at 14:09 +0300, Niko Tyni wrote:
> > On Sun, May 24, 2015 at 02:55:00PM +0800, Paul Wise wrote:
> > > On Sat, 2015-05-23 at 19:10 +0200, Dominic Hargreaves wrote:
> > > > This is rather strange; any ideas from DSA?
> > >
> > > The underlying hosts do not have the same issue.
> > > All of the guests use the same virtual CPU version/flags.
> > > All of the guests use the same Linux kernel version.
> >
> > Thanks for the update.
> >
> > > I guess diving into the Linux implementation of times(2) for
> > > clues would be the next step for figuring out what the issue is
> > > here.
> >
> > I'm taking the kernel maintainers in the loop.
> >
> > The status here is that times(2) seems to be misbehaving on some
> > i386 and amd64 debian.org virtual hosts running jessie (under
> > ganeti/qemu, with jessie on the underlying hosts too). These hosts
> > include at least barriere and x86-grnet-01.
> >
> > The misbehaviour is that user time stays at zero all the time, as
> > seen for example with 'time yes'. This is making perl fail to build
> > from source due to test failures, and I'd expect it to affect other
> > things too.
> >
> > Any help is appreciated.
>
> I can't reproduce this, but wonder if it's related to #784960?

There seems to be something fundamentally broken in
barriere.debian.org's CPU time accounting, not related to times(2) per
se. Just issuing `yes > /dev/null' and firing up `top -d1' gives the
following interesting results:

 - `yes' shows up taking 100% CPU time as expected, but
 - pressing `1' shows that all CPUs are idle (!)

htop OTOH displays all CPUs as constantly 100% busy, which is
inconsistent with the system's load average (~0.8 at that point).

Also watching the output of

  `cat /proc/$(pidof yes)/stat | awk '{ print $14, $15 }''

($14 is utime, $15 is stime per proc(5)) indeed shows 100% system time
and 0 user time.

If you look at the `top' stats for all CPUs of barriere.debian.org, it
looks as if the only thing that's correctly being accounted for is
iowait time.

Cheers,
Apollon
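The awk one-liner above reads fields 14 and 15 of /proc/<pid>/stat. A minimal sketch of the same extraction in Python, assuming the proc(5) field layout; the function name and the sample line are made up for illustration. It splits after the closing parenthesis of the comm field, since a process name containing spaces would shift the whitespace-separated field numbers that awk relies on.

```python
import os


def parse_stat_times(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # Field 2 (comm) is parenthesised and may contain spaces or ')', so
    # split on the last ')' rather than on whitespace alone.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]), int(rest[12])


if __name__ == "__main__":
    # Report this process's own accounting; Linux only.
    with open("/proc/self/stat") as stat_file:
        utime, stime = parse_stat_times(stat_file.read())
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    print(f"utime={utime / ticks_per_second:.2f}s stime={stime / ticks_per_second:.2f}s")
```

On a guest showing the behaviour described above, one would expect utime to stay at 0 for a busy `yes' while stime keeps growing.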
Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
On Sun, 2015-05-24 at 14:09 +0300, Niko Tyni wrote:
> On Sun, May 24, 2015 at 02:55:00PM +0800, Paul Wise wrote:
> > On Sat, 2015-05-23 at 19:10 +0200, Dominic Hargreaves wrote:
> > > This is rather strange; any ideas from DSA?
> >
> > The underlying hosts do not have the same issue.
> > All of the guests use the same virtual CPU version/flags.
> > All of the guests use the same Linux kernel version.
>
> Thanks for the update.
>
> > I guess diving into the Linux implementation of times(2) for clues
> > would be the next step for figuring out what the issue is here.
>
> I'm taking the kernel maintainers in the loop.
>
> The status here is that times(2) seems to be misbehaving on some i386
> and amd64 debian.org virtual hosts running jessie (under ganeti/qemu,
> with jessie on the underlying hosts too). These hosts include at
> least barriere and x86-grnet-01.
>
> The misbehaviour is that user time stays at zero all the time, as
> seen for example with 'time yes'. This is making perl fail to build
> from source due to test failures, and I'd expect it to affect other
> things too.
>
> Any help is appreciated.

I can't reproduce this, but wonder if it's related to #784960?

Ben.

-- 
Ben Hutchings
Experience is what causes a person to make new mistakes instead of old ones.
Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
On Sat, 2015-05-23 at 19:10 +0200, Dominic Hargreaves wrote:
> This is rather strange; any ideas from DSA?

The underlying hosts do not have the same issue.
All of the guests use the same virtual CPU version/flags.
All of the guests use the same Linux kernel version.

I guess diving into the Linux implementation of times(2) for clues
would be the next step for figuring out what the issue is here.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise
Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
On Wed, May 20, 2015 at 09:01:26PM +0100, Dominic Hargreaves wrote:
> On Mon, May 18, 2015 at 09:44:24AM +0200, Martin Zobel-Helas wrote:
> > Hi,
> >
> > On Sun May 17, 2015 at 22:18:52 +0300, Niko Tyni wrote:
> > > DSA (also cc'd): What's the virtualization setup with
> > > x86-grnet-01, brahms and binet? Is there a difference with babin,
> > > which managed to build the i386 binaries? Are the underlying
> > > virtualization hosts running jessie too?
> >
> > ganeti, using qemu for all architectures. The underlying
> > virtualization hosts for x86-grnet-01 and brahms run jessie, binet's
> > virtualization host still runs wheezy.
> >
> > All VMs are bootstrapped by DSA with the same script, which is
> > available at [1]. After the bootstrapping is done, buildd
> > maintainers take over and set up the buildd on the VM.
>
> It seems to be reproducible on barriere.debian.org:
>
>   t/op/time . # Failed test 2 - very basic times test at op/time.t line 33
>   FAILED at test 2
>
> More digging needed.

The test calls the times(2) system call, expecting to see the real
issue can be reduced to:

  dom@barriere:~$ time yes > /dev/null
  ^C

  real    0m2.768s
  user    0m0.000s
  sys     0m2.764s

We'd expect 'yes' to take 'user' time instead, like this:

  dom@himalia:~$ time yes > /dev/null
  ^C

  real    0m2.686s
  user    0m2.656s
  sys     0m0.032s

This is rather strange; any ideas from DSA?

Cheers,
Dominic.
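The `time yes' comparison above boils down to asking times(2) how a busy loop was accounted. A minimal sketch of that check using Python's os.times(), which wraps the same system call; the function name and spin duration are arbitrary choices for illustration, and this is not the perl test from op/time.t.

```python
import os
import time


def cpu_time_split(spin_seconds=0.5):
    """Spin in user space, then return the (user, system) CPU time deltas."""
    before = os.times()
    deadline = time.monotonic() + spin_seconds
    counter = 0
    while time.monotonic() < deadline:
        counter += 1  # busy work that should be accounted as user time
    after = os.times()
    return after.user - before.user, after.system - before.system


if __name__ == "__main__":
    user, system = cpu_time_split()
    print(f"user={user:.3f}s sys={system:.3f}s")
    if user == 0.0:
        print("user time did not advance; CPU accounting looks suspect")
```

On a host with working accounting most of the elapsed time should show up as user time, as in the himalia output; a result of zero user time would match what barriere reported.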
Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
On Mon, May 18, 2015 at 09:44:24AM +0200, Martin Zobel-Helas wrote:
> Hi,
>
> On Sun May 17, 2015 at 22:18:52 +0300, Niko Tyni wrote:
> > DSA (also cc'd): What's the virtualization setup with x86-grnet-01,
> > brahms and binet? Is there a difference with babin, which managed
> > to build the i386 binaries? Are the underlying virtualization hosts
> > running jessie too?
>
> ganeti, using qemu for all architectures. The underlying
> virtualization hosts for x86-grnet-01 and brahms run jessie, binet's
> virtualization host still runs wheezy.
>
> All VMs are bootstrapped by DSA with the same script, which is
> available at [1]. After the bootstrapping is done, buildd maintainers
> take over and set up the buildd on the VM.

It seems to be reproducible on barriere.debian.org:

  t/op/time . # Failed test 2 - very basic times test at op/time.t line 33
  FAILED at test 2

More digging needed.
Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
Hi,

On Sun May 17, 2015 at 22:18:52 +0300, Niko Tyni wrote:
> DSA (also cc'd): What's the virtualization setup with x86-grnet-01,
> brahms and binet? Is there a difference with babin, which managed to
> build the i386 binaries? Are the underlying virtualization hosts
> running jessie too?

ganeti, using qemu for all architectures. The underlying
virtualization hosts for x86-grnet-01 and brahms run jessie, binet's
virtualization host still runs wheezy.

All VMs are bootstrapped by DSA with the same script, which is
available at [1]. After the bootstrapping is done, buildd maintainers
take over and set up the buildd on the VM.

Cheers,
Martin

[1] http://anonscm.debian.org/cgit/mirror/dsa-misc.git/tree/scripts/VM-installs/ganeti-unified
-- 
Martin Zobel-Helas zo...@debian.org
Debian System Administrator
Debian GNU/Linux Developer
Debian Listmaster
http://about.me/zobel
Debian Webmaster
GPG Fingerprint: 6B18 5642 8E41 EC89 3D5D BDBB 53B1 AC6D B11B 627B
Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
Package: perl
Version: 5.20.2-3
Severity: serious
Tags: unreproducible help stretch sid
X-Debbugs-Cc: Kurt Roeckx k...@roeckx.be, debian-ad...@lists.debian.org

As discussed in the thread at

  http://lists.alioth.debian.org/pipermail/perl-maintainers/2015-May/004855.html

perl currently fails to build on several i386 and amd64 buildds due to
failing / hanging timer related tests.

  https://buildd.debian.org/status/logs.php?pkg=perl&arch=amd64
  https://buildd.debian.org/status/logs.php?pkg=perl&arch=i386

Kurt (cc'd) says the buildds were recently upgraded to jessie, except
that x86-grnet-01 was running jessie already in late March when the
first failure of this kind happened (5.20.2-3/i386).

So it looks like the tests only fail when the buildd is running
jessie, but not on every host and/or every build. So far we haven't
been able to reproduce this outside the buildds.

DSA (also cc'd): What's the virtualization setup with x86-grnet-01,
brahms and binet? Is there a difference with babin, which managed to
build the i386 binaries? Are the underlying virtualization hosts
running jessie too?

Reporting this against the perl version in stable, but I'm not sure
yet if it affects stable buildds. I expect it does, but tagging as
stretch+sid for now. Clearly something somewhere needs to be fixed
before a release...

  # Failed test 2 - very basic times test at op/time.t line 33
  t/op/time . FAILED at test 2

  t/itimer.t: overall time allowed for tests (360s) exceeded!
  cpan/Time-HiRes/t/itimer .. FAILED--expected 2 tests, saw 1

  lib/Benchmark . makefile:807: recipe for target 'test' failed
  Build killed with signal TERM after 150 minutes of inactivity

-- 
Niko Tyni nt...@debian.org