On Friday, 10 May 2013 at 23:29:33 UTC, H. S. Teoh wrote:
It turns out that this mysterious "stuck" state was caused by the stack trace code -- but not in any of the usual ways. In order to produce the trace, it uses fprintf to write info to the log, and fprintf in turn calls malloc at various points to allocate the buffers it needs. Now, if for some reason free() segfaults (e.g., you pass in an illegal pointer), then libc is still holding the internal malloc mutex when the OS delivers the SIGSEGV to the process, so when the stack trace handler then calls fprintf, which in turn calls malloc, it deadlocks. Further SIGSEGVs won't help, since they only make the deadlock worse.
This is the very reason why the NullPointerError handler builds a fake stack frame and hijacks the EIP register: so that none of that kind of work is done inside the signal handler itself.
This is very confusing machinery, so it should live in runtime code and never be used directly by users.
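For comparison, here is a minimal C sketch of the other common way to avoid the malloc deadlock described above: keep the handler itself restricted to async-signal-safe calls. To be clear, this is not the fake-stack-frame/EIP technique and not the runtime's actual code. The names format_hex, on_segv and install_segv_handler are made up for illustration; the only library calls used are sigaction, sigemptyset, write and _exit, and it assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: render `value` as "0x..." hex into buf without
 * touching the heap. Returns the number of bytes written (no NUL),
 * or 0 if buf is too small. */
static size_t format_hex(uintptr_t value, char *buf, size_t cap)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof(uintptr_t)];
    size_t n = 0, len = 0;

    /* Collect digits least-significant first. */
    do {
        tmp[n++] = digits[value & 0xf];
        value >>= 4;
    } while (value != 0);

    if (cap < n + 2)
        return 0;
    buf[len++] = '0';
    buf[len++] = 'x';
    while (n > 0)
        buf[len++] = tmp[--n];
    return len;
}

/* Hypothetical handler: no fprintf, no malloc, only a stack buffer
 * plus write(2) and _exit(2), both async-signal-safe under POSIX. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    static const char prefix[] = "SIGSEGV at address ";
    char addr[2 + 2 * sizeof(uintptr_t) + 1];
    size_t len;
    ssize_t rc;

    (void)sig;
    (void)ctx;

    len = format_hex((uintptr_t)info->si_addr, addr, sizeof addr - 1);
    addr[len++] = '\n';

    rc = write(STDERR_FILENO, prefix, sizeof prefix - 1);
    rc = write(STDERR_FILENO, addr, len);
    (void)rc;
    _exit(1);
}

/* Hypothetical installer: returns 0 on success, -1 on failure. */
static int install_segv_handler(void)
{
    struct sigaction sa = {0};

    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_segv_handler() once at startup is enough. The trade-off is that a handler like this can only report the faulting address and exit; producing a full stack trace or throwing an error still needs something like the frame-hijacking approach, which moves the heavy work out of signal context.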
