Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?
On Thursday 17 December 2009 12:27:17 pm Steven Hartland wrote:
> - Original Message -
> From: John Baldwin j...@freebsd.org
>
> > For the hang it seems you have a thread waiting in a blocking read(), a thread waiting in a blocking accept(), and lots of threads creating condition variables. However, the pthread_cond_init() in libpthread (libthr on FreeBSD) doesn't call pthread_cleanup_push(), so your stack trace doesn't make sense to me. However, that may be gdb getting confused. The pthread_cleanup_push() frame may be cond_init(). However, it doesn't call umtx_op() (the _thr_umutex_init() call it makes just initializes the structure, it doesn't make a _umtx_op() system call). You might try posting on threads@ to try to get more info on this, but your pthread_cond_init() stack traces don't really make sense. Can you rebuild libc and libthr with debug symbols? For example:
> >
> > # cd /usr/src/lib/libc
> > # make clean
> > # make DEBUG_FLAGS=-g
> > # make DEBUG_FLAGS=-g install
> >
> > However, if you are hanging in read(), that usually means you have a socket that just doesn't have data. That might be an application bug of some sort.
> >
> > The segv trace doesn't include the first part of GDB messages which show which thread actually had a seg fault. It looks like it was the thread that was throwing an exception. However, nanosleep() doesn't throw exceptions, so that stack trace doesn't really make sense either. Perhaps that stack is hosed by the exception handling code?
>
> I've uploaded two more traces for the oxt test failure / segv:
> http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1
>
> From looking at the test case, it is testing the capture of failures and its ability to create stack trace output, so that may give others some indication of where the issue may be.
>
> I will look to do the same for the hang issue, but that's on a live site, so I will need to schedule some downtime before I can get those rebuilt and then wait for it to hang again, which could be quite some time :(

Hmmm, the only seg fault I see is happening down inside libgcc in the stack unwinding code, and that is 3rd party code from gcc.

--
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?
- Original Message -
From: John Baldwin

> > I've uploaded two more traces for the oxt test failure / segv:
> > http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1
> >
> > From looking at the test case, it is testing the capture of failures and its ability to create stack trace output, so that may give others some indication of where the issue may be.
> >
> > I will look to do the same for the hang issue, but that's on a live site, so I will need to schedule some downtime before I can get those rebuilt and then wait for it to hang again, which could be quite some time :(
>
> Hmmm, the only seg fault I see is happening down inside libgcc in the stack unwinding code, and that is 3rd party code from gcc.

Thanks for looking John, so you believe this may be an issue with the gcc code? What would be the next step on this, raise it on a gcc mailing list or something?

Regards
Steve

This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmas...@multiplay.co.uk.
Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?
On Monday 21 December 2009 9:45:53 am Steven Hartland wrote:
> - Original Message -
> From: John Baldwin
>
> > > I've uploaded two more traces for the oxt test failure / segv:
> > > http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1
> > >
> > > From looking at the test case, it is testing the capture of failures and its ability to create stack trace output, so that may give others some indication of where the issue may be.
> > >
> > > I will look to do the same for the hang issue, but that's on a live site, so I will need to schedule some downtime before I can get those rebuilt and then wait for it to hang again, which could be quite some time :(
> >
> > Hmmm, the only seg fault I see is happening down inside libgcc in the stack unwinding code, and that is 3rd party code from gcc.
>
> Thanks for looking John, so you believe this may be an issue with the gcc code? What would be the next step on this, raise it on a gcc mailing list or something?

I'm not sure. :) That may be best. You could also try examining the registers and assembly to see if you can figure out more of what is going on when it dies.

--
John Baldwin
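One possible way to follow the suggestion about registers and assembly is to script gdb against a core file. The sketch below is only an illustration: the BINARY and CORE variables and the command-file name are hypothetical, and it assumes the installed gdb accepts -batch and -x. It writes a small gdb command file and runs gdb only when both variables are set.

```shell
#!/bin/sh
# Sketch: collect the backtrace, registers and nearby instructions
# at the point of the crash.
# BINARY and CORE are hypothetical placeholders supplied by the caller.

write_gdb_cmds() {
    # $1 is the file that receives the gdb commands.
    cat > "$1" <<'EOF'
info threads
bt
info registers
x/16i $pc
EOF
}

cmdfile=$(mktemp "${TMPDIR:-/tmp}/segv-inspect.XXXXXX")
write_gdb_cmds "$cmdfile"

# Only invoke gdb when both a binary and a core file were given.
if [ -n "$BINARY" ] && [ -n "$CORE" ]; then
    gdb -batch -x "$cmdfile" "$BINARY" "$CORE"
fi
```

For example, BINARY=/path/to/test-binary CORE=/path/to/core sh inspect.sh would print the faulting thread's backtrace, its register contents, and the 16 instructions starting at the program counter.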
Passenger hangs on live and SEGV on tests possible threading / kernel bug?
We're having an issue with Passenger on FreeBSD where it will hang and stop processing any more requests. The details are attached to the following bug report:
http://code.google.com/p/phusion-passenger/issues/detail?id=318#c14

In addition, the test suite crashes in what seems to be a very basic test, which I'm at a loss with:
http://code.google.com/p/phusion-passenger/issues/detail?id=441

I'm thinking this may be a bug in either the FreeBSD kernel or the thread library, as the crashes don't make any sense from the application side.

Any advice on debugging or feedback on the stack traces would be much appreciated.

Regards
Steve
Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?
On Thursday 17 December 2009 6:12:07 am Steven Hartland wrote:
> We're having an issue with Passenger on FreeBSD where it will hang and stop processing any more requests. The details are attached to the following bug report:
> http://code.google.com/p/phusion-passenger/issues/detail?id=318#c14
>
> In addition, the test suite crashes in what seems to be a very basic test, which I'm at a loss with:
> http://code.google.com/p/phusion-passenger/issues/detail?id=441
>
> I'm thinking this may be a bug in either the FreeBSD kernel or the thread library, as the crashes don't make any sense from the application side.
>
> Any advice on debugging or feedback on the stack traces would be much appreciated.

For the hang it seems you have a thread waiting in a blocking read(), a thread waiting in a blocking accept(), and lots of threads creating condition variables. However, the pthread_cond_init() in libpthread (libthr on FreeBSD) doesn't call pthread_cleanup_push(), so your stack trace doesn't make sense to me. However, that may be gdb getting confused. The pthread_cleanup_push() frame may be cond_init(). However, it doesn't call umtx_op() (the _thr_umutex_init() call it makes just initializes the structure, it doesn't make a _umtx_op() system call). You might try posting on threads@ to try to get more info on this, but your pthread_cond_init() stack traces don't really make sense. Can you rebuild libc and libthr with debug symbols? For example:

# cd /usr/src/lib/libc
# make clean
# make DEBUG_FLAGS=-g
# make DEBUG_FLAGS=-g install

However, if you are hanging in read(), that usually means you have a socket that just doesn't have data. That might be an application bug of some sort.

The segv trace doesn't include the first part of GDB messages which show which thread actually had a seg fault. It looks like it was the thread that was throwing an exception. However, nanosleep() doesn't throw exceptions, so that stack trace doesn't really make sense either. Perhaps that stack is hosed by the exception handling code?

--
John Baldwin
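The example above only shows libc, while the request covers both libc and libthr. A minimal sketch of repeating the same steps for each library follows. SRC_ROOT and DRY_RUN are hypothetical knobs added for illustration, and the sketch assumes libthr lives under /usr/src/lib/libthr. By default it only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch: rebuild libc and libthr with debug symbols, using the same
# make invocations shown for libc.
# SRC_ROOT and DRY_RUN are hypothetical knobs, not part of the thread.
SRC_ROOT="${SRC_ROOT:-/usr/src}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = run them

rebuild_debug() {
    for lib in libc libthr; do
        dir="$SRC_ROOT/lib/$lib"
        for target in "clean" "DEBUG_FLAGS=-g" "DEBUG_FLAGS=-g install"; do
            if [ "$DRY_RUN" = "1" ]; then
                echo "cd $dir && make $target"
            else
                # $target is left unquoted on purpose so that
                # "DEBUG_FLAGS=-g install" becomes two make arguments.
                (cd "$dir" && make $target) || return 1
            fi
        done
    done
}

rebuild_debug
```

Running it with DRY_RUN=0 as root would perform the rebuild and install. On a live site that is probably best done in a scheduled maintenance window.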
Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?
On Thu, 17 Dec 2009, John Baldwin wrote:
> On Thursday 17 December 2009 6:12:07 am Steven Hartland wrote:
> > We're having an issue with Passenger on FreeBSD where it will hang and stop processing any more requests. The details are attached to the following bug report:
> > http://code.google.com/p/phusion-passenger/issues/detail?id=318#c14
> >
> > In addition, the test suite crashes in what seems to be a very basic test, which I'm at a loss with:
> > http://code.google.com/p/phusion-passenger/issues/detail?id=441
> >
> > I'm thinking this may be a bug in either the FreeBSD kernel or the thread library, as the crashes don't make any sense from the application side.
> >
> > Any advice on debugging or feedback on the stack traces would be much appreciated.
>
> For the hang it seems you have a thread waiting in a blocking read(), a thread waiting in a blocking accept(), and lots of threads creating condition variables. However, the pthread_cond_init() in libpthread (libthr on FreeBSD) doesn't call pthread_cleanup_push(), so your stack trace doesn't make sense to me. However, that may be gdb getting confused. The pthread_cleanup_push() frame may be cond_init(). However, it doesn't call umtx_op() (the _thr_umutex_init() call it makes just initializes the structure, it doesn't make a _umtx_op() system call). You might try posting on threads@ to try to get more info on this, but your pthread_cond_init() stack traces don't really make sense. Can you rebuild libc and libthr with debug symbols?

Yes, good advice. I have noticed that you can't trust GDB stack traces unless libc and libthr have been built with debug (-g) enabled.

--
DE
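Before trusting a trace, it may help to confirm that the installed libraries actually carry debug information. The sketch below is one way to do that, under the assumption that readelf is available and that a -g build leaves a .debug_info section in the installed library. The function names are hypothetical, and the library paths are passed in by the caller because they vary by release.

```shell
#!/bin/sh
# Sketch: report whether each given library has DWARF debug info.
# has_debug_info and check_lib are hypothetical helper names.

has_debug_info() {
    # Reads section headers on stdin; succeeds if .debug_info is listed.
    grep -q '\.debug_info'
}

check_lib() {
    # $1 is the path to a shared library or executable.
    if readelf -S "$1" 2>/dev/null | has_debug_info; then
        echo "$1: debug info present"
    else
        echo "$1: no debug info found"
    fi
}

for lib in "$@"; do
    check_lib "$lib"
done
```

It would be invoked with the paths of the installed libc and libthr as arguments. A "no debug info found" result for either one suggests that traces through that library should be read with caution.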
Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?
- Original Message -
From: John Baldwin j...@freebsd.org

> For the hang it seems you have a thread waiting in a blocking read(), a thread waiting in a blocking accept(), and lots of threads creating condition variables. However, the pthread_cond_init() in libpthread (libthr on FreeBSD) doesn't call pthread_cleanup_push(), so your stack trace doesn't make sense to me. However, that may be gdb getting confused. The pthread_cleanup_push() frame may be cond_init(). However, it doesn't call umtx_op() (the _thr_umutex_init() call it makes just initializes the structure, it doesn't make a _umtx_op() system call). You might try posting on threads@ to try to get more info on this, but your pthread_cond_init() stack traces don't really make sense. Can you rebuild libc and libthr with debug symbols? For example:
>
> # cd /usr/src/lib/libc
> # make clean
> # make DEBUG_FLAGS=-g
> # make DEBUG_FLAGS=-g install
>
> However, if you are hanging in read(), that usually means you have a socket that just doesn't have data. That might be an application bug of some sort.
>
> The segv trace doesn't include the first part of GDB messages which show which thread actually had a seg fault. It looks like it was the thread that was throwing an exception. However, nanosleep() doesn't throw exceptions, so that stack trace doesn't really make sense either. Perhaps that stack is hosed by the exception handling code?

I've uploaded two more traces for the oxt test failure / segv:
http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1

From looking at the test case, it is testing the capture of failures and its ability to create stack trace output, so that may give others some indication of where the issue may be.

I will look to do the same for the hang issue, but that's on a live site, so I will need to schedule some downtime before I can get those rebuilt and then wait for it to hang again, which could be quite some time :(

Regards
Steve