Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?

2009-12-21 Thread John Baldwin
On Thursday 17 December 2009 12:27:17 pm Steven Hartland wrote:
 - Original Message - 
 From: John Baldwin j...@freebsd.org
  For the hang it seems you have a thread waiting in a blocking read(), a thread
  waiting in a blocking accept(), and lots of threads creating condition
  variables.  However, the pthread_cond_init() in libpthread (libthr on FreeBSD)
  doesn't call pthread_cleanup_push(), so your stack trace doesn't make sense to
  me.  However, that may be gdb getting confused.  The pthread_cleanup_push()
  frame may be cond_init().  However, it doesn't call _umtx_op() (the
  _thr_umutex_init() call it makes just initializes the structure; it doesn't
  make a _umtx_op() system call).  You might try posting on threads@ to try to
  get more info on this, but your pthread_cond_init() stack traces don't really
  make sense.  Can you rebuild libc and libthr with debug symbols?
  
  For example:
  
  # cd /usr/src/lib/libc
  # make clean 
  # make DEBUG_FLAGS=-g
  # make DEBUG_FLAGS=-g install
  
  However, if you are hanging in read(), that usually means you have a socket 
  that just doesn't have data.  That might be an application bug of some sort.
  
  The segv trace doesn't include the first part of GDB messages which show which
  thread actually had a seg fault.  It looks like it was the thread that was
  throwing an exception.  However, nanosleep() doesn't throw exceptions, so that
  stack trace doesn't really make sense either.  Perhaps that stack is hosed by
  the exception handling code?
 
 I've uploaded two more traces for the oxt test failure / segv.
 http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1
 
 From looking at the test case, it is testing the capture of failures
 and its ability to create stack trace output, so that may give others
 some indication of where the issue may be.
 
 I will look to do the same for the hang issue, but that's on a live
 site, so I will need to schedule some downtime before I can get those
 rebuilt and then wait for it to hang again, which could be quite some
 time :(

Hmmm, the only seg fault I see is happening down inside libgcc in the stack
unwinding code and that is 3rd party code from gcc.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Passenger hangs on live and SEGV on tests possible threading /kernel bug?

2009-12-21 Thread Steven Hartland
- Original Message - 
From: John Baldwin 

  I've uploaded two more traces for the oxt test failure / segv.
  http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1

  From looking at the test case, it is testing the capture of failures
  and its ability to create stack trace output, so that may give others
  some indication of where the issue may be.

  I will look to do the same for the hang issue, but that's on a live
  site, so I will need to schedule some downtime before I can get those
  rebuilt and then wait for it to hang again, which could be quite some
  time :(

 Hmmm, the only seg fault I see is happening down inside libgcc in the stack
 unwinding code and that is 3rd party code from gcc.


Thanks for looking, John. So you believe this may be an issue with the gcc code?

What would be the next step on this? Raise it on a gcc mailing list or something?

   Regards
   Steve


This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 


In the event of misdirection, illegible or incomplete transmission please 
telephone +44 845 868 1337
or return the E.mail to postmas...@multiplay.co.uk.



Re: Passenger hangs on live and SEGV on tests possible threading /kernel bug?

2009-12-21 Thread John Baldwin
On Monday 21 December 2009 9:45:53 am Steven Hartland wrote:
 - Original Message - 
 From: John Baldwin 
   I've uploaded two more traces for the oxt test failure / segv.
   http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1

   From looking at the test case, it is testing the capture of failures
   and its ability to create stack trace output, so that may give others
   some indication of where the issue may be.

   I will look to do the same for the hang issue, but that's on a live
   site, so I will need to schedule some downtime before I can get those
   rebuilt and then wait for it to hang again, which could be quite some
   time :(

  Hmmm, the only seg fault I see is happening down inside libgcc in the stack
  unwinding code and that is 3rd party code from gcc.

 Thanks for looking, John. So you believe this may be an issue with the gcc
 code?

 What would be the next step on this? Raise it on a gcc mailing list or
 something?

I'm not sure. :)  That may be best.  You could also try examining the
registers and assembly to see if you can figure out more of what is going on
when it dies.
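
For anyone following along, a rough sketch of what examining the registers and assembly could look like, assuming the crashing test left a core file behind. The gdb commands are standard ones; the command-file name and the paths in the comments are placeholders and are not taken from the bug report:

```shell
#!/bin/sh
# Hypothetical name for the gdb command file (an assumption, not from the thread).
REG_CMDS="${REG_CMDS:-segv-regs.gdb}"

# Commands gdb will run in order:
#   frame 0             select the innermost frame (where the fault happened)
#   info registers      dump the CPU registers for that frame
#   x/16i $pc           disassemble 16 instructions starting at the program counter
#   info sharedlibrary  show where libgcc, libc and libthr are mapped
cat > "$REG_CMDS" <<'EOF'
frame 0
info registers
x/16i $pc
info sharedlibrary
EOF

echo "wrote $REG_CMDS"
# Then run it by hand against the test binary and its core (placeholder paths):
#   gdb -batch -x segv-regs.gdb /path/to/test-binary /path/to/core
```

Comparing the faulting instruction's operands against the register dump usually shows which pointer was bad, which is useful detail to include if this goes to a gcc list.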

-- 
John Baldwin


Passenger hangs on live and SEGV on tests possible threading / kernel bug?

2009-12-17 Thread Steven Hartland

We're having an issue with Passenger on FreeBSD where it will hang
and stop processing any more requests. The details are attached to
the following bug report:
http://code.google.com/p/phusion-passenger/issues/detail?id=318#c14

In addition, the test suite crashes in what seems to be a very
basic test, which I'm at a loss to explain.
http://code.google.com/p/phusion-passenger/issues/detail?id=441

I'm thinking this may be a bug in either the FreeBSD kernel or
thread library, as the crashes don't make any sense from the
application side.

Any advice on debugging or feedback on the stack traces would
be much appreciated.

   Regards
   Steve




Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?

2009-12-17 Thread John Baldwin
On Thursday 17 December 2009 6:12:07 am Steven Hartland wrote:
 We're having an issue with Passenger on FreeBSD where it will hang
 and stop processing any more requests. The details are attached to
 the following bug report:
 http://code.google.com/p/phusion-passenger/issues/detail?id=318#c14
 
 In addition, the test suite crashes in what seems to be a very
 basic test, which I'm at a loss to explain.
 http://code.google.com/p/phusion-passenger/issues/detail?id=441
 
 I'm thinking this may be a bug in either the FreeBSD kernel or
 thread library, as the crashes don't make any sense from the
 application side.
 
 Any advice on debugging or feedback on the stack traces would
 be much appreciated.

For the hang it seems you have a thread waiting in a blocking read(), a thread
waiting in a blocking accept(), and lots of threads creating condition
variables.  However, the pthread_cond_init() in libpthread (libthr on FreeBSD)
doesn't call pthread_cleanup_push(), so your stack trace doesn't make sense to
me.  However, that may be gdb getting confused.  The pthread_cleanup_push()
frame may be cond_init().  However, it doesn't call _umtx_op() (the
_thr_umutex_init() call it makes just initializes the structure; it doesn't
make a _umtx_op() system call).  You might try posting on threads@ to try to
get more info on this, but your pthread_cond_init() stack traces don't really
make sense.  Can you rebuild libc and libthr with debug symbols?

For example:

# cd /usr/src/lib/libc
# make clean 
# make DEBUG_FLAGS=-g
# make DEBUG_FLAGS=-g install
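
The example above covers only libc. The same steps would presumably apply to libthr, whose sources live in /usr/src/lib/libthr. A minimal dry-run sketch, assuming the stock /usr/src layout; SRC_ROOT, DRY_RUN and the rebuild_debug helper are names made up for illustration and are not part of the FreeBSD build system:

```shell
#!/bin/sh
# Assumed defaults: the standard source tree location, and dry-run mode so
# nothing is rebuilt or installed unless explicitly requested.
SRC_ROOT="${SRC_ROOT:-/usr/src}"
DRY_RUN="${DRY_RUN:-1}"

# rebuild_debug LIB: clean, build and install LIB with -g, mirroring the
# libc commands above. In dry-run mode it only prints each command.
rebuild_debug() {
    dir="$SRC_ROOT/lib/$1"
    for step in "make clean" "make DEBUG_FLAGS=-g" "make DEBUG_FLAGS=-g install"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "cd $dir && $step"
        else
            # $step is deliberately unquoted so it splits into make + arguments.
            (cd "$dir" && $step) || return 1
        fi
    done
}

rebuild_debug libc
rebuild_debug libthr
```

By default this only prints what it would do. Running it as root with DRY_RUN=0 would actually rebuild and install both libraries, so on a live machine that belongs in a scheduled maintenance window.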

However, if you are hanging in read(), that usually means you have a socket 
that just doesn't have data.  That might be an application bug of some sort.

The segv trace doesn't include the first part of GDB messages which show which 
thread actually had a seg fault.  It looks like it was the thread that was 
throwing an exception.  However, nanosleep() doesn't throw exceptions, so that 
stack trace doesn't really make sense either.  Perhaps that stack is hosed by 
the exception handling code?
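
One way to make sure the first part of the GDB output is not lost is to run gdb in batch mode and redirect everything to a file. A rough sketch, assuming the test left a core file behind; the file names and the binary and core paths in the comments are placeholders, not taken from the bug report:

```shell
#!/bin/sh
# Hypothetical name for the gdb command file (an assumption, not from the thread).
GDB_CMDS="${GDB_CMDS:-segv-trace.gdb}"

# Commands gdb will run in order:
#   info threads         list all threads (the current one is marked with '*')
#   bt                   backtrace of the thread gdb selected on startup
#   thread apply all bt  backtrace of every thread
cat > "$GDB_CMDS" <<'EOF'
info threads
bt
thread apply all bt
EOF

echo "wrote $GDB_CMDS"
# Then run it by hand (placeholder paths):
#   gdb -batch -x segv-trace.gdb /path/to/test-binary /path/to/core > segv-trace.txt 2>&1
# Redirecting both stdout and stderr keeps gdb's startup messages, including
# the line reporting which signal terminated the program.
```

Attaching the whole of segv-trace.txt, rather than a pasted excerpt, avoids trimming the startup lines by accident.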

-- 
John Baldwin


Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?

2009-12-17 Thread Daniel Eischen

On Thu, 17 Dec 2009, John Baldwin wrote:

 On Thursday 17 December 2009 6:12:07 am Steven Hartland wrote:

  We're having an issue with Passenger on FreeBSD where it will hang
  and stop processing any more requests. The details are attached to
  the following bug report:
  http://code.google.com/p/phusion-passenger/issues/detail?id=318#c14

  In addition, the test suite crashes in what seems to be a very
  basic test, which I'm at a loss to explain.
  http://code.google.com/p/phusion-passenger/issues/detail?id=441

  I'm thinking this may be a bug in either the FreeBSD kernel or
  thread library, as the crashes don't make any sense from the
  application side.

  Any advice on debugging or feedback on the stack traces would
  be much appreciated.

 For the hang it seems you have a thread waiting in a blocking read(), a thread
 waiting in a blocking accept(), and lots of threads creating condition
 variables.  However, the pthread_cond_init() in libpthread (libthr on FreeBSD)
 doesn't call pthread_cleanup_push(), so your stack trace doesn't make sense to
 me.  However, that may be gdb getting confused.  The pthread_cleanup_push()
 frame may be cond_init().  However, it doesn't call _umtx_op() (the
 _thr_umutex_init() call it makes just initializes the structure; it doesn't
 make a _umtx_op() system call).  You might try posting on threads@ to try to
 get more info on this, but your pthread_cond_init() stack traces don't really
 make sense.  Can you rebuild libc and libthr with debug symbols?


Yes, good advice.  I have noticed that you can't trust GDB stack
traces unless libc and libthr have been built with debugging (-g)
enabled.

--
DE


Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?

2009-12-17 Thread Steven Hartland
- Original Message - 
From: John Baldwin j...@freebsd.org
 For the hang it seems you have a thread waiting in a blocking read(), a thread
 waiting in a blocking accept(), and lots of threads creating condition
 variables.  However, the pthread_cond_init() in libpthread (libthr on FreeBSD)
 doesn't call pthread_cleanup_push(), so your stack trace doesn't make sense to
 me.  However, that may be gdb getting confused.  The pthread_cleanup_push()
 frame may be cond_init().  However, it doesn't call _umtx_op() (the
 _thr_umutex_init() call it makes just initializes the structure; it doesn't
 make a _umtx_op() system call).  You might try posting on threads@ to try to
 get more info on this, but your pthread_cond_init() stack traces don't really
 make sense.  Can you rebuild libc and libthr with debug symbols?

 For example:

 # cd /usr/src/lib/libc
 # make clean
 # make DEBUG_FLAGS=-g
 # make DEBUG_FLAGS=-g install

 However, if you are hanging in read(), that usually means you have a socket
 that just doesn't have data.  That might be an application bug of some sort.

 The segv trace doesn't include the first part of GDB messages which show which
 thread actually had a seg fault.  It looks like it was the thread that was
 throwing an exception.  However, nanosleep() doesn't throw exceptions, so that
 stack trace doesn't really make sense either.  Perhaps that stack is hosed by
 the exception handling code?


I've uploaded two more traces for the oxt test failure / segv.
http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1

From looking at the test case, it is testing the capture of failures
and its ability to create stack trace output, so that may give others
some indication of where the issue may be.

I will look to do the same for the hang issue, but that's on a live
site, so I will need to schedule some downtime before I can get those
rebuilt and then wait for it to hang again, which could be quite some
time :(

   Regards
   Steve

