Re: Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults
clone 673594 -1
severity -1 important
retitle -1 ruby1.8: threaded code segfaults under kfreebsd-*
tags 673594 + pending
thanks

Hi Steven,

Steven Chamberlain wrote:
> Whereas the buildds experience hangs during some tests, I see segfaults
> instead. This sometimes happens even before the first test has been run.
>
> This small Ruby testcase results in segfault 50% of the time under
> ruby1.8 1.8.7.358-2, but always succeeds with ruby1.9.1 1.9.3.0-2:
>
> require 'thread'
> Thread.new do
>   foo = bar
> end
>
> (Measured out of 100 runs, on kfreebsd-i386 with 4-way SMP)
>
> Attached are outputs from ktrace for a success and from a failure; then
> I've tried to diff them. There seems to be a race whereby thread0 tries
> to call thr_kill on thread2, but if that happens too late thread2 will
> trigger a segfault instead.

Thanks for the patch. I am preparing an upload to work around the test
timeout and make the FTBFS go away ASAP.

If you can prepare a patch to fix the race condition, please attach it to
the new bug report which I am creating by cloning this one.

-- 
Antonio Terceiro terce...@debian.org
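The "50% of 100 runs" figure quoted above can be measured with a small harness. What follows is only a minimal sketch, not the script Steven actually used: the names tally_runs and TESTCASE are hypothetical, and it assumes a modern Ruby is available to drive the interpreter under test.

```ruby
require 'tempfile'

# Run `script_source` under `interpreter` `runs` times and tally how each
# run ended. Returns a Hash with :ok, :signaled and :failed counts.
# (tally_runs is a hypothetical helper, not part of ruby1.8 or Debian.)
def tally_runs(interpreter, script_source, runs = 100)
  counts = { ok: 0, signaled: 0, failed: 0 }
  Tempfile.create(['testcase', '.rb']) do |file|
    file.write(script_source)
    file.flush
    runs.times do
      # Discard output; only the way the process ended matters here.
      system(interpreter, file.path, out: File::NULL, err: File::NULL)
      status = $?
      if status.signaled?
        counts[:signaled] += 1 # killed by a signal, e.g. abort() after [BUG]
      elsif status.success?
        counts[:ok] += 1
      else
        counts[:failed] += 1   # ordinary non-zero exit
      end
    end
  end
  counts
end

# The testcase from the report, verbatim.
TESTCASE = <<~RUBY
  require 'thread'
  Thread.new do
    foo = bar
  end
RUBY
```

Calling tally_runs('ruby1.8', TESTCASE) on an affected kfreebsd-i386 machine would then give the split between clean exits and signal deaths. The interpreter name is an assumption about the local installation.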
Re: Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults
found 673594 1.8.7.352-2
tags 673594 + patch
thanks

Hi,

What about using the attached patch to time out the test-all suite if it
hangs, as was done for ruby1.9.1? Its exit status is ignored anyway (some
failures are expected, on all arches).

I think a workaround like this is needed to at least fix the FTBFS, since
there are security patches and s390x stuff all waiting on it. The version
of ruby1.8 in testing seems to already have this problem (it only built
for kfreebsd-i386 on the 4th attempt).

We can separately follow up on working out why some of the tests hang;
probably thread-related races in eglibc and/or the tests themselves.

Thanks,
Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org

diff --git a/debian/control b/debian/control
index c8d77e0..db229e5 100644
--- a/debian/control
+++ b/debian/control
@@ -3,7 +3,7 @@ Section: ruby
 Priority: optional
 Maintainer: akira yamada ak...@debian.org
 Uploaders: Daigo Moriwaki da...@debian.org, Lucas Nussbaum lu...@debian.org, Antonio Terceiro terce...@debian.org
-Build-Depends: cdbs (= 0.4.106), debhelper (= 5), autotools-dev, autoconf, m4, quilt (= 0.40), patch, bison, binutils (= 2.14.90.0.7), libgdbm-dev, libncurses5-dev, libreadline-gplv2-dev, tcl-dev, tk-dev, zlib1g-dev, libssl-dev (= 0.9.6b), file
+Build-Depends: cdbs (= 0.4.106), debhelper (= 5), autotools-dev, autoconf, m4, quilt (= 0.40), patch, bison, binutils (= 2.14.90.0.7), libgdbm-dev, libncurses5-dev, libreadline-gplv2-dev, tcl-dev, tk-dev, zlib1g-dev, libssl-dev (= 0.9.6b), file, coreutils
 Standards-Version: 3.9.2
 Homepage: http://www.ruby-lang.org/
 Vcs-Git: git://git.debian.org/collab-maint/ruby1.8.git
diff --git a/debian/rules b/debian/rules
index e238759..1456921 100755
--- a/debian/rules
+++ b/debian/rules
@@ -62,7 +62,7 @@ DEB_MAKE_BUILD_TARGET = all test
 common-post-build-arch::
 ifeq (,$(filter nocheck,$(DEB_BUILD_OPTIONS)))
-	-make test-all
+	-timeout 1200 make test-all
 endif
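For readers unfamiliar with the mechanism in the patch above: the leading "-" on the recipe line makes make ignore the command's exit status, and timeout 1200 ends the command if it is still running after 1200 seconds. The Ruby sketch below approximates the same idea for a single child process. It is only an illustration: run_with_timeout is a hypothetical helper, and it does not reproduce every detail of coreutils timeout (exit code conventions, process-group handling).

```ruby
require 'timeout'

# Run `command` (an argv Array). If it has not exited within `seconds`,
# send it SIGTERM and report :timed_out; otherwise return its exit status.
# (run_with_timeout is a hypothetical name used for illustration only.)
def run_with_timeout(command, seconds)
  pid = Process.spawn(*command)
  begin
    status = Timeout.timeout(seconds) { Process.wait2(pid)[1] }
    status.exitstatus
  rescue Timeout::Error
    Process.kill('TERM', pid)
    Process.wait(pid) # reap the child so it does not linger as a zombie
    :timed_out
  end
end
```

A caller that, like debian/rules here, does not care about the result can simply discard the return value, for example run_with_timeout(%w[make test-all], 1200).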
Re: Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults
Whereas the buildds experience hangs during some tests, I see segfaults
instead. This sometimes happens even before the first test has been run.

This small Ruby testcase results in segfault 50% of the time under
ruby1.8 1.8.7.358-2, but always succeeds with ruby1.9.1 1.9.3.0-2:

require 'thread'
Thread.new do
  foo = bar
end

(Measured out of 100 runs, on kfreebsd-i386 with 4-way SMP)

Attached are outputs from ktrace for a success and from a failure; then
I've tried to diff them. There seems to be a race whereby thread0 tries
to call thr_kill on thread2, but if that happens too late thread2 will
trigger a segfault instead.

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org

--- ok.txt	2012-05-20 20:56:17.734917958 +0100
+++ fail.txt	2012-05-20 20:58:12.337235026 +0100
@@ -356,7 +356,7 @@
 thread0 ruby1.8 RET open 3
 thread0 ruby1.8 CALL read(0x3,0xbfbfe61c,0x4)
 thread0 ruby1.8 GIO fd 3 read 4 bytes
-       0x f0be 5f81 |.._.|
+       0x d52b 6642 |.+fB|
 thread0 ruby1.8 RET read 4
 thread0 ruby1.8 CALL close(0x3)
@@ -411,7 +411,7 @@
 thread0 ruby1.8 CALL gettimeofday(0xbfbfe5b8,0)
 thread0 ruby1.8 RET gettimeofday 0
 thread0 ruby1.8 CALL getpid
-thread0 ruby1.8 RET getpid 50320/0xc490
+thread0 ruby1.8 RET getpid 50346/0xc4aa
 thread0 ruby1.8 CALL break(0x808d000)
 thread0 ruby1.8 RET break 0
 thread0 ruby1.8 CALL sigaction(SIGINT,0xbfbfe564,0xbfbfe5b8)
@@ -750,65 +750,49 @@
 thread2 ruby1.8 RET sigprocmask 0
 thread2 ruby1.8 CALL clock_gettime(0,0x28b68eb8)
 thread2 ruby1.8 RET clock_gettime 0
+thread2 ruby1.8 CALL sigprocmask(SIG_SETMASK,0x28b68e90,0)
+thread2 ruby1.8 RET sigprocmask 0
+thread2 ruby1.8 CALL clock_gettime(0,0x28b68f30)
+thread2 ruby1.8 RET clock_gettime 0
+thread2 ruby1.8 CALL sigprocmask(SIG_BLOCK,0,0x28b68e80)
+thread2 ruby1.8 RET sigprocmask 0
+thread0 ruby1.8 PSIG SIGSEGV caught handler=0x2818ba50 mask=0x8000 code=0x1
+thread2 ruby1.8 CALL sigprocmask(SIG_UNBLOCK,0x28b68ea0,0x28b68e90)
+thread2 ruby1.8 RET sigprocmask 0
+thread0 ruby1.8 CALL write(0x2,0xbfbfbf9c,0xb)
+thread2 ruby1.8 CALL clock_gettime(0,0x28b68eb8)
+thread2 ruby1.8 RET clock_gettime 0
+thread0 ruby1.8 GIO fd 2 wrote 11 bytes
+       test.rb:4:
 thread2 ruby1.8 CALL nanosleep(0x28b68eb0,0)
-thread0 ruby1.8 CALL thr_kill(thread2,SIG(null))
-thread0 ruby1.8 RET thr_kill 0
-thread2 ruby1.8 RET nanosleep -1 errno 4 Interrupted system call
-thread0 ruby1.8 CALL sigprocmask(SIG_SETMASK,0,0xbfbfdc4c)
-thread2 ruby1.8 PSIG SIG(null) caught handler=0x28188860 mask=0x7ffefeff code=0x10001
+thread0 ruby1.8 RET write 11/0xb
+thread2 ruby1.8 RET nanosleep -1 errno 22 Invalid argument
+thread0 ruby1.8 CALL write(0x2,0x2813751f,0x6)
+thread2 ruby1.8 CALL clock_gettime(0,0x28b68eb8)
+thread0 ruby1.8 GIO fd 2 wrote 6 bytes
+       [BUG]
+thread2 ruby1.8 RET clock_gettime 0
+thread0 ruby1.8 RET write 6
+thread2 ruby1.8 CALL nanosleep(0x28b68eb0,0)
+thread2 ruby1.8 RET nanosleep -1 errno 22 Invalid argument
+thread0 ruby1.8 CALL write(0x2,0xbfbf98a0,0x12)
+thread2 ruby1.8 CALL clock_gettime(0,0x28b68eb8)
+thread0 ruby1.8 GIO fd 2 wrote 18 bytes
+       Segmentation fault
+thread2 ruby1.8 RET clock_gettime 0
+thread0 ruby1.8 RET write 18/0x12
+thread2 ruby1.8 CALL nanosleep(0x28b68eb0,0)
+thread0 ruby1.8 CALL write(0x2,0xbfbf98a0,0x3d)
+thread0 ruby1.8 GIO fd 2 wrote 61 bytes
+
+(2012-02-08 patchlevel 358) [i486-kfreebsd-gnu]
+
+
+thread0 ruby1.8 RET write 61/0x3d
+thread0 ruby1.8 CALL sigprocmask(SIG_UNBLOCK,0xbfbfbf50,0)
 thread0 ruby1.8 RET sigprocmask 0
-thread2 ruby1.8 CALL sigprocmask(SIG_SETMASK,0x28b68e80,0)
-thread0 ruby1.8 CALL sigsuspend(0xbfbfdc4c)
-thread2 ruby1.8 RET sigprocmask 0
-thread2 ruby1.8 CALL thr_kill(thread0,SIG(null))
-thread2 ruby1.8 RET thr_kill 0
-thread0 ruby1.8 PSIG SIG(null) caught handler=0x28188860 mask=0x8000 code=0x10001
-thread2 ruby1.8 CALL thr_kill(thread1,SIG(null))
-thread0 ruby1.8 RET sigsuspend JUSTRETURN
-thread2 ruby1.8 RET thr_kill 0
-thread0 ruby1.8 CALL sigreturn(0xbfbfd930)
+thread0 ruby1.8 CALL thr_kill(thread0,SIGIOT)
+thread0 ruby1.8 RET thr_kill 0
+thread0 ruby1.8 PSIG SIGIOT SIG_DFL code=0x10001
 thread1 ruby1.8 RET poll -1 errno 4 Interrupted system call
-thread0 ruby1.8 RET sigreturn JUSTRETURN
-thread1 ruby1.8 PSIG SIG(null) caught handler=0x28188710 mask=0xfffefeef code=0x10001
-thread2 ruby1.8 CALL thr_exit(0x807c1c0)
-thread0 ruby1.8 CALL write(0x4,0xbfbfdc8c,0x24)
-thread1 ruby1.8 CALL sigreturn(0x807a850)
-thread0 ruby1.8 GIO fd 4 wrote 36 bytes
-       0x a01a 3328 0100 0204 44a1 0e00 0060 0828 bc2a 1828 70fd 1628 d0d5 1728 e0dc
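When comparing traces like the two diffed above, it can help to reduce each dump to its signal-related records (thr_kill calls and PSIG deliveries) before diffing. The following is a rough sketch of such a filter. It assumes the line layout seen in this message ("thread command TYPE rest", optionally prefixed by a diff marker); SignalEvent, KTRACE_LINE and signal_events are hypothetical names, not part of ktrace or kdump.

```ruby
# One signal-related record extracted from a ktrace text dump.
SignalEvent = Struct.new(:thread, :kind, :detail)

# Matches "<thread> <command> <TYPE> <rest>", optionally prefixed by a
# unified-diff marker (+, - or a space). Hex-dump payload lines do not match.
KTRACE_LINE = /\A[+\- ]?(\S+)\s+\S+\s+(CALL|RET|PSIG|GIO)\s+(.*)\z/

# Return the thr_kill calls (:sent) and signal deliveries (:delivered)
# found in `lines`, in their original order.
def signal_events(lines)
  lines.each_with_object([]) do |line, events|
    match = KTRACE_LINE.match(line.chomp)
    next unless match

    thread, type, rest = match.captures
    if type == 'PSIG'
      events << SignalEvent.new(thread, :delivered, rest)
    elsif type == 'CALL' && rest.start_with?('thr_kill(')
      events << SignalEvent.new(thread, :sent, rest)
    end
  end
end
```

Running signal_events(File.readlines('fail.txt')) and the same on ok.txt would list, per thread, which signals were sent and which were delivered, which is the ordering the suspected race depends on.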
Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults
Package: src:ruby1.8
Version: 1.8.7.352-2
Severity: serious
Tags: sid wheezy
User: debian-bsd@lists.debian.org
Usertags: kfreebsd
X-Debbugs-Cc: k...@debian.org
X-Debbugs-Cc: debian-bsd@lists.debian.org
Justification: fails to build from source (but built successfully in the past)

Hi,

On 20/05/12 01:19, Cyril Brulebois wrote:
> https://buildd.debian.org/status/logs.php?arch=kfreebsd-amd64&pkg=ruby1.8&ver=1.8.7.358-2

Seems that this issue *rarely* happens during kfreebsd-i386 builds too
(in the same place, but test_safe_04 isn't necessarily at fault).

https://buildd.debian.org/status/fetch.php?pkg=ruby1.8&arch=kfreebsd-i386&ver=1.8.7.352-2&stamp=1313126333 :
> test_safe_04(TestERBCoreWOStrScan): .
> E: Caught signal 'Terminated': terminating immediately
> make[1]: *** [test-all] Terminated
> make: *** [common-post-build-arch] Terminated
> test_cd(TestFileUtils): Build killed with signal TERM after 150 minutes of inactivity

When I try this myself, I hit segfaults in the testsuite before it even
gets that far. :(

The result of the test-all suite is ignored anyway. Something was added
for ruby1.9.1, to time out any tests that hang -- maybe we could use it
here too:
http://anonscm.debian.org/gitweb/?p=collab-maint/ruby1.9.1.git;a=commitdiff;h=6c64e43924695aec1f995202a032fb2e0e955eb3

Also #593139 might have something relevant to fixing ruby1.8.

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org

steven@kfreebsd-i386:~/ruby1.8-1.8.7.358$ gdb ruby1.8 -c ruby1.8.core -s debian/libruby1.8-dbg/usr/lib/debug/usr/bin/ruby1.8
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as i486-kfreebsd-gnu.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /home/steven/ruby1.8-1.8.7.358/ruby1.8...done.
[New process 100385]
[New process 101043]
[New process 101042]
Core was generated by `ruby1.8'.
Program terminated with signal 6, Aborted.
#0  0x282c95f6 in syscall () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
(gdb) thread apply all bt

Thread 3 (process 101042):
#0  0x282c1202 in poll () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#1  0x281869ee in __pthread_manager () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#2  0x in ?? ()

Thread 2 (process 101043):
#0  0x2818c272 in nanosleep () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#1  0x28187e0f in __pthread_timedsuspend_new_clk () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#2  0x28185bce in pthread_cond_timedwait@GLIBC_2.3 () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#3  0x280967b9 in thread_timer (dummy=0xbfbf81f8) at eval.c:12325
#4  0x28186671 in pthread_start_thread () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#5  0x in ?? ()

Thread 1 (process 100385):
#0  0x282c95f6 in syscall () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#1  0x2818937b in pthread_kill () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#2  0x281893b6 in raise () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#3  0x2822e624 in raise () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#4  0x282316c3 in abort () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#5  0x28091929 in rb_bug (fmt=fmt@entry=0x28132286 "Segmentation fault") at error.c:213
#6  0x28100469 in sigsegv (sig=<optimized out>) at signal.c:634
#7  sigsegv (sig=11) at signal.c:622
#8  0x2818bb47 in __pthread_sighandler () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#9  <signal handler called>
#10 0x in ?? ()
#11 0x2c10742c in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)