1.  gdb was actually able to unwind the whole stack without any errors. I just 
didn't paste it in the email, since it was a long one.
2.  Hmm, we don't have asm code, AFAIK. The program ran fine, without any 
problems, when I didn't link with libunwind. The default backtrace() was then 
called and unwound the stacks without problems. So maybe the compiler is doing 
an okay job here?

What is the "n" in UNW_DEBUG_LEVEL=n that you recommend? Does it write to its 
own log file or stderr?

BTW, we really liked libunwind, for two reasons: (1) it runs A LOT faster than 
the default backtrace(); (2) it doesn't malloc() when taking stacktraces, so it 
worked very well with Google's heap profiler. We'd really appreciate it if we 
could continue this debugging to make it work.
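
For readers following the thread, the capture pattern described above can be 
sketched roughly as follows. This is an illustration only, not code from this 
thread: capture_stack, print_stack and MAX_FRAMES are hypothetical names, and 
the sketch assumes a glibc-style <execinfo.h>. Whether the backtrace() call 
resolves to the C library's version or to libunwind's depends on how the 
program is linked, and whether the underlying call allocates depends on that 
implementation.

```c
#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd() */
#include <unistd.h>     /* STDERR_FILENO */

#define MAX_FRAMES 64   /* hypothetical limit chosen for this sketch */

/*
 * Capture up to max_frames return addresses into a caller-supplied buffer.
 * The wrapper itself allocates nothing; whether backtrace() allocates
 * internally depends on which implementation the program is linked with.
 */
int capture_stack(void **frames, int max_frames)
{
    return backtrace(frames, max_frames);
}

/*
 * Write the captured frames to stderr. backtrace_symbols_fd() writes
 * straight to the file descriptor instead of returning a malloc()ed array
 * the way backtrace_symbols() does.
 */
void print_stack(void *const *frames, int depth)
{
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);
}
```

A caller would declare `void *frames[MAX_FRAMES];`, call 
`capture_stack(frames, MAX_FRAMES)`, and pass the returned depth to 
`print_stack()`. Resolving addresses to file names and line numbers (the bfd 
step mentioned below) is deliberately left out of this sketch.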

Also, we are running a multi-threaded application. Is that okay with libunwind? 
We also do some bfd operations, similar to what addr2line does, so we can 
generate full stacktraces with filenames and line numbers. This can happen in 
another thread at the same time as libunwind's backtrace() is being called. Is 
that okay? I did spend some time eliminating those bfd calls while debugging 
this problem, but there is a chance I missed one or two places. If you tell me 
that's not supported, I can completely disable those bfd calls to see whether 
the crash happens again.

Thanks.

-Haiping

On 9/27/09 10:12 PM, "Arun Sharma" <[email protected]> wrote:

On Sat, Sep 26, 2009 at 7:04 PM, Haiping Zhao <[email protected]> wrote:
> Here it is. I still have the core dump, so please let me know if you need
> extra information. Thanks!
>

Looking at the address libunwind was trying to dereference, it looks
like libunwind got either bad or incomplete unwind information.

Couple of questions:

* Was gdb able to unwind the frames below backtrace()? If the unwind
information was bad, neither gdb nor libunwind will be able to unwind.
But gdb has the advantage of being out of process. If it dereferences
a bad pointer, it gets an EFAULT (not SIGSEGV).

* Typically this kind of a problem is the result of:
  * Hand coded asm with missing unwind info
  * Compiler generated bad/incomplete unwind info

To debug the latter case, we'll need to run with UNW_DEBUG_LEVEL=n and
further dump the unwind info (using readelf) and figure out where
things went wrong.

 -Arun
