Re: Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults

2012-05-24 Thread Antonio Terceiro
clone 673594 -1
severity -1 important
retitle -1 ruby1.8: threaded code segfaults under kfreebsd-*
tags 673594 + pending
thanks

Hi Steven,

Steven Chamberlain wrote:
> Whereas the buildds experience hangs during some tests, I see segfaults
> instead.  This sometimes happens even before the first test has been run.
>
> This small Ruby testcase results in segfault 50% of the time under
> ruby1.8 1.8.7.358-2, but always succeeds with ruby1.9.1 1.9.3.0-2:
>
>  require 'thread'
>  Thread.new do
>    foo = bar
>  end
>
> (Measured out of 100 runs, on kfreebsd-i386 with 4-way SMP)
>
> Attached are outputs from ktrace for a success and from a failure;  then
> I've tried to diff them.  There seems to be a race whereby thread0 tries
> to call thr_kill on thread2, but if that happens too late thread2 will
> trigger a segfault instead.

Thanks for the patch. I am preparing an upload to work around the test
timeout and make the FTBFS go away ASAP.

If you can prepare a patch to fix the race condition, please attach it
to the new bug report which I am creating by cloning this one.

-- 
Antonio Terceiro terce...@debian.org




Re: Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults

2012-05-20 Thread Steven Chamberlain
found 673594 1.8.7.352-2
tags 673594 + patch
thanks

Hi,

What about using the attached patch to time out the test-all suite if it
hangs, as was done for ruby1.9.1?  Its exit status is ignored anyway
(some failures are expected on all arches).
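
For anyone unfamiliar with timeout(1), the behaviour the patch relies on can
be sketched roughly as follows. This is only an illustration and not part of
the patch: run_with_timeout is a made-up name, and the sketch assumes Ruby 1.9
or later, since it uses Process.spawn.

```ruby
require 'timeout'

# Run +cmd+ (an array of program and arguments) and give up after +limit+
# seconds. Returns :ok, :failed or :timed_out. A caller that discards the
# result behaves like the leading "-" in debian/rules.
# NOTE: run_with_timeout is a hypothetical helper, not an existing API.
def run_with_timeout(cmd, limit)
  pid = Process.spawn(*cmd)
  begin
    _, status = Timeout.timeout(limit) { Process.wait2(pid) }
    status.success? ? :ok : :failed
  rescue Timeout::Error
    begin
      Process.kill('TERM', pid) # SIGTERM is what timeout(1) sends by default
      Process.wait(pid)         # reap the child so no zombie is left behind
    rescue Errno::ESRCH, Errno::ECHILD
      # the child exited on its own just as the timer fired
    end
    :timed_out
  end
end
```

With this, run_with_timeout(%w[make test-all], 1200) would correspond to the
"timeout 1200 make test-all" line in the patch.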

I think a workaround like this is needed to at least fix the FTBFS since
there are security patches and s390x stuff all waiting on it.  The
version of ruby1.8 in testing seems to already have this problem (it
only built for kfreebsd-i386 on the 4th attempt).

We can separately follow up on working out why some of the tests hang;
probably thread-related races in eglibc and/or the tests themselves.

Thanks,
Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org
diff --git a/debian/control b/debian/control
index c8d77e0..db229e5 100644
--- a/debian/control
+++ b/debian/control
@@ -3,7 +3,7 @@ Section: ruby
 Priority: optional
 Maintainer: akira yamada ak...@debian.org
 Uploaders: Daigo Moriwaki da...@debian.org, Lucas Nussbaum lu...@debian.org, Antonio Terceiro terce...@debian.org
-Build-Depends: cdbs (>= 0.4.106), debhelper (>= 5), autotools-dev, autoconf, m4, quilt (>= 0.40), patch, bison, binutils (>= 2.14.90.0.7), libgdbm-dev, libncurses5-dev, libreadline-gplv2-dev, tcl-dev, tk-dev, zlib1g-dev, libssl-dev (>= 0.9.6b), file
+Build-Depends: cdbs (>= 0.4.106), debhelper (>= 5), autotools-dev, autoconf, m4, quilt (>= 0.40), patch, bison, binutils (>= 2.14.90.0.7), libgdbm-dev, libncurses5-dev, libreadline-gplv2-dev, tcl-dev, tk-dev, zlib1g-dev, libssl-dev (>= 0.9.6b), file, coreutils
 Standards-Version: 3.9.2
 Homepage: http://www.ruby-lang.org/
 Vcs-Git: git://git.debian.org/collab-maint/ruby1.8.git
diff --git a/debian/rules b/debian/rules
index e238759..1456921 100755
--- a/debian/rules
+++ b/debian/rules
@@ -62,7 +62,7 @@ DEB_MAKE_BUILD_TARGET = all test
 
 common-post-build-arch::
 ifeq (,$(filter nocheck,$(DEB_BUILD_OPTIONS)))
-	-make test-all
+	-timeout 1200 make test-all
 endif
 
 


Re: Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults

2012-05-20 Thread Steven Chamberlain
Whereas the buildds experience hangs during some tests, I see segfaults
instead.  This sometimes happens even before the first test has been run.

This small Ruby testcase results in segfault 50% of the time under
ruby1.8 1.8.7.358-2, but always succeeds with ruby1.9.1 1.9.3.0-2:

 require 'thread'
 Thread.new do
   foo = bar
 end

(Measured out of 100 runs, on kfreebsd-i386 with 4-way SMP)

Attached are outputs from ktrace for a success and from a failure;  then
I've tried to diff them.  There seems to be a race whereby thread0 tries
to call thr_kill on thread2, but if that happens too late thread2 will
trigger a segfault instead.
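
A failure rate like the one quoted above can be collected with a small harness
along these lines. It is a sketch under assumptions, not the script actually
used for the measurement: count_crashes is a made-up name, and the harness
itself has to run under a newer interpreter (e.g. ruby1.9.1), because
Process.spawn and File::NULL do not exist in 1.8.

```ruby
# Run +cmd+ (an array of program and arguments) +runs+ times and return how
# many of those runs were terminated by a signal. In the failing trace the
# interpreter ends by raising SIGIOT (abort) on itself after reporting the
# segfault, so such a run counts as signaled.
# NOTE: count_crashes is a hypothetical helper, not an existing API.
def count_crashes(cmd, runs)
  crashes = 0
  runs.times do
    # Discard the child's output; only its exit status matters here.
    pid = Process.spawn(*cmd, :out => File::NULL, :err => File::NULL)
    _, status = Process.wait2(pid)
    crashes += 1 if status.signaled?
  end
  crashes
end
```

For example, count_crashes(%w[ruby1.8 test.rb], 100) would return the number
of crashed runs out of 100, assuming the testcase is saved as test.rb.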

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org
--- ok.txt  2012-05-20 20:56:17.734917958 +0100
+++ fail.txt	2012-05-20 20:58:12.337235026 +0100
@@ -356,7 +356,7 @@
 thread0 ruby1.8  RET   open 3
 thread0 ruby1.8  CALL  read(0x3,0xbfbfe61c,0x4)
 thread0 ruby1.8  GIO   fd 3 read 4 bytes
- 0x f0be 5f81  |.._.|
+ 0x d52b 6642  |.+fB|
 
 thread0 ruby1.8  RET   read 4
 thread0 ruby1.8  CALL  close(0x3)
@@ -411,7 +411,7 @@
 thread0 ruby1.8  CALL  gettimeofday(0xbfbfe5b8,0)
 thread0 ruby1.8  RET   gettimeofday 0
 thread0 ruby1.8  CALL  getpid
-thread0 ruby1.8  RET   getpid 50320/0xc490
+thread0 ruby1.8  RET   getpid 50346/0xc4aa
 thread0 ruby1.8  CALL  break(0x808d000)
 thread0 ruby1.8  RET   break 0
 thread0 ruby1.8  CALL  sigaction(SIGINT,0xbfbfe564,0xbfbfe5b8)
@@ -750,65 +750,49 @@
 thread2 ruby1.8  RET   sigprocmask 0
 thread2 ruby1.8  CALL  clock_gettime(0,0x28b68eb8)
 thread2 ruby1.8  RET   clock_gettime 0
+thread2 ruby1.8  CALL  sigprocmask(SIG_SETMASK,0x28b68e90,0)
+thread2 ruby1.8  RET   sigprocmask 0
+thread2 ruby1.8  CALL  clock_gettime(0,0x28b68f30)
+thread2 ruby1.8  RET   clock_gettime 0
+thread2 ruby1.8  CALL  sigprocmask(SIG_BLOCK,0,0x28b68e80)
+thread2 ruby1.8  RET   sigprocmask 0
+thread0 ruby1.8  PSIG  SIGSEGV caught handler=0x2818ba50 mask=0x8000 code=0x1
+thread2 ruby1.8  CALL  sigprocmask(SIG_UNBLOCK,0x28b68ea0,0x28b68e90)
+thread2 ruby1.8  RET   sigprocmask 0
+thread0 ruby1.8  CALL  write(0x2,0xbfbfbf9c,0xb)
+thread2 ruby1.8  CALL  clock_gettime(0,0x28b68eb8)
+thread2 ruby1.8  RET   clock_gettime 0
+thread0 ruby1.8  GIO   fd 2 wrote 11 bytes
+ test.rb:4: 
 thread2 ruby1.8  CALL  nanosleep(0x28b68eb0,0)
-thread0 ruby1.8  CALL  thr_kill(thread2,SIG(null))
-thread0 ruby1.8  RET   thr_kill 0
-thread2 ruby1.8  RET   nanosleep -1 errno 4 Interrupted system call
-thread0 ruby1.8  CALL  sigprocmask(SIG_SETMASK,0,0xbfbfdc4c)
-thread2 ruby1.8  PSIG  SIG(null) caught handler=0x28188860 mask=0x7ffefeff code=0x10001
+thread0 ruby1.8  RET   write 11/0xb
+thread2 ruby1.8  RET   nanosleep -1 errno 22 Invalid argument
+thread0 ruby1.8  CALL  write(0x2,0x2813751f,0x6)
+thread2 ruby1.8  CALL  clock_gettime(0,0x28b68eb8)
+thread0 ruby1.8  GIO   fd 2 wrote 6 bytes
+ [BUG] 
+thread2 ruby1.8  RET   clock_gettime 0
+thread0 ruby1.8  RET   write 6
+thread2 ruby1.8  CALL  nanosleep(0x28b68eb0,0)
+thread2 ruby1.8  RET   nanosleep -1 errno 22 Invalid argument
+thread0 ruby1.8  CALL  write(0x2,0xbfbf98a0,0x12)
+thread2 ruby1.8  CALL  clock_gettime(0,0x28b68eb8)
+thread0 ruby1.8  GIO   fd 2 wrote 18 bytes
+ Segmentation fault
+thread2 ruby1.8  RET   clock_gettime 0
+thread0 ruby1.8  RET   write 18/0x12
+thread2 ruby1.8  CALL  nanosleep(0x28b68eb0,0)
+thread0 ruby1.8  CALL  write(0x2,0xbfbf98a0,0x3d)
+thread0 ruby1.8  GIO   fd 2 wrote 61 bytes
+ 
+(2012-02-08 patchlevel 358) [i486-kfreebsd-gnu]
+   
+ 
+thread0 ruby1.8  RET   write 61/0x3d
+thread0 ruby1.8  CALL  sigprocmask(SIG_UNBLOCK,0xbfbfbf50,0)
 thread0 ruby1.8  RET   sigprocmask 0
-thread2 ruby1.8  CALL  sigprocmask(SIG_SETMASK,0x28b68e80,0)
-thread0 ruby1.8  CALL  sigsuspend(0xbfbfdc4c)
-thread2 ruby1.8  RET   sigprocmask 0
-thread2 ruby1.8  CALL  thr_kill(thread0,SIG(null))
-thread2 ruby1.8  RET   thr_kill 0
-thread0 ruby1.8  PSIG  SIG(null) caught handler=0x28188860 mask=0x8000 code=0x10001
-thread2 ruby1.8  CALL  thr_kill(thread1,SIG(null))
-thread0 ruby1.8  RET   sigsuspend JUSTRETURN
-thread2 ruby1.8  RET   thr_kill 0
-thread0 ruby1.8  CALL  sigreturn(0xbfbfd930)
+thread0 ruby1.8  CALL  thr_kill(thread0,SIGIOT)
+thread0 ruby1.8  RET   thr_kill 0
+thread0 ruby1.8  PSIG  SIGIOT SIG_DFL code=0x10001
 thread1 ruby1.8  RET   poll -1 errno 4 Interrupted system call
-thread0 ruby1.8  RET   sigreturn JUSTRETURN
-thread1 ruby1.8  PSIG  SIG(null) caught handler=0x28188710 mask=0xfffefeef code=0x10001
-thread2 ruby1.8  CALL  thr_exit(0x807c1c0)
-thread0 ruby1.8  CALL  write(0x4,0xbfbfdc8c,0x24)
-thread1 ruby1.8  CALL  sigreturn(0x807a850)
-thread0 ruby1.8  GIO   fd 4 wrote 36 bytes
- 0x a01a 3328 0100  0204  44a1 0e00 0060 0828 bc2a 1828 70fd 1628 d0d5 1728 e0dc

Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults

2012-05-19 Thread Steven Chamberlain
Package: src:ruby1.8
Version: 1.8.7.352-2
Severity: serious
Tags: sid wheezy
User: debian-bsd@lists.debian.org
Usertags: kfreebsd
X-Debbugs-Cc: k...@debian.org
X-Debbugs-Cc: debian-bsd@lists.debian.org
Justification: fails to build from source (but built successfully in the past)

Hi,

On 20/05/12 01:19, Cyril Brulebois wrote:
> https://buildd.debian.org/status/logs.php?arch=kfreebsd-amd64&pkg=ruby1.8&ver=1.8.7.358-2

Seems that this issue *rarely* happens during kfreebsd-i386 builds too
(in the same place, but test_safe_04 isn't necessarily at fault).

https://buildd.debian.org/status/fetch.php?pkg=ruby1.8&arch=kfreebsd-i386&ver=1.8.7.352-2&stamp=1313126333 :
 test_safe_04(TestERBCoreWOStrScan): .
 E: Caught signal 'Terminated': terminating immediately
 make[1]: *** [test-all] Terminated
 make: *** [common-post-build-arch] Terminated
 test_cd(TestFileUtils): Build killed with signal TERM after 150 minutes of inactivity

When I try this myself, I hit segfaults in the testsuite before it even
gets that far. :(


The result of the test-all suite is ignored anyway.  Something was added
for ruby1.9.1, to time out any tests that hang -- maybe we could use it
here too:

http://anonscm.debian.org/gitweb/?p=collab-maint/ruby1.9.1.git;a=commitdiff;h=6c64e43924695aec1f995202a032fb2e0e955eb3

Also #593139 might have something relevant to fixing ruby1.8.

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org
steven@kfreebsd-i386:~/ruby1.8-1.8.7.358$ gdb ruby1.8 -c ruby1.8.core -s debian/libruby1.8-dbg/usr/lib/debug/usr/bin/ruby1.8
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-kfreebsd-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/steven/ruby1.8-1.8.7.358/ruby1.8...done.
[New process 100385]
[New process 101043]
[New process 101042]
Core was generated by `ruby1.8'.
Program terminated with signal 6, Aborted.
#0  0x282c95f6 in syscall () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
(gdb) thread apply all bt

Thread 3 (process 101042):
#0  0x282c1202 in poll () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#1  0x281869ee in __pthread_manager () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#2  0x in ?? ()

Thread 2 (process 101043):
#0  0x2818c272 in nanosleep () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#1  0x28187e0f in __pthread_timedsuspend_new_clk () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#2  0x28185bce in pthread_cond_timedwait@GLIBC_2.3 () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#3  0x280967b9 in thread_timer (dummy=0xbfbf81f8) at eval.c:12325
#4  0x28186671 in pthread_start_thread () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#5  0x in ?? ()

Thread 1 (process 100385):
#0  0x282c95f6 in syscall () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#1  0x2818937b in pthread_kill () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#2  0x281893b6 in raise () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#3  0x2822e624 in raise () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#4  0x282316c3 in abort () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#5  0x28091929 in rb_bug (fmt=fmt@entry=0x28132286 "Segmentation fault") at error.c:213
#6  0x28100469 in sigsegv (sig=<optimized out>) at signal.c:634
#7  sigsegv (sig=11) at signal.c:622
#8  0x2818bb47 in __pthread_sighandler () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#9  signal handler called
#10 0x in ?? ()
#11 0x2c10742c in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)