Dear Haskell maintainers,

I've progressed a little and found that the problem comes down to accessing global variables that are declared in dynamic libraries. In a nutshell, this doesn't work, as the addresses of these global variables are all wrong when ghci is executing the code. So, I think I hit:
http://hackage.haskell.org/trac/ghc/ticket/781

I was able to work around this problem by compiling the C modules with -fPIC. This bug is pretty bad, I'd say. I've added myself to its CC list.

Cheers,
Axel

On 14.07.2010, at 16:51, Axel Simon wrote:

> Hi all,
>
> I'm trying to debug a segfault relating to the memory management in
> Gtk2Hs. Rather than make you read the ticket
> http://hackage.haskell.org/trac/gtk2hs/ticket/1183, I'll describe
> the problem:
>
> - compiler 6.12.1 or 6.12.3
> - darcs head of Gtk2Hs with #define DEBUG instead of #undef DEBUG in
>   gtk/Graphics/UI/Gtk/General/hsthread.c
> - platform Ubuntu Linux, x86-64
> - to reproduce: cd gtk2hs/gtk/demo/hello and run ghci World.hs and
>   type 'main'
>
> A window with the "Hello World" button appears. After a few seconds,
> the GC runs and the finaliser of the GtkButton is run since the
> Haskell program no longer holds a reference to that object (only the
> GtkWindow in C land has).
>
> Thus, the GC calls a C function gtk2hs_g_object_unref_from_mainloop
> which is supposed to enqueue the object into a global data structure
> from which objects are later taken and g_object_unref is called on
> them.
>
> This global data structure is protected by a mutex, which is
> acquired using g_static_mutex_lock:
>
> void gtk2hs_g_object_unref_from_mainloop(gpointer object) {
>
>   int mutex_locked = 0;
>   if (threads_initialised) {
> #ifdef DEBUG
>     printf("acquiring lock to add a %s object at %lx\n",
>            g_type_name(G_OBJECT_TYPE(object)), (unsigned long) object);
>     printf("value of lock function is %lx\n",
>            (unsigned long) g_thread_functions_for_glib_use.mutex_lock);
> #endif
>     g_rand_new();
> #if defined( WIN32 )
>     EnterCriticalSection(&gtk2hs_finalizer_mutex);
> #else
>     g_static_mutex_lock(&gtk2hs_finalizer_mutex);
> #endif
>     mutex_locked = 1;
>   }
>   [..]
>
> The program prints:
>
> acquiring lock to add a GtkButton object at 22d8020
> value of lock function is 0
> zsh: segmentation fault  ghci World
>
> Now the debugging weirdness starts. Whatever I do, I cannot get gdb
> to find the symbol gtk2hs_g_object_unref_from_mainloop.
>
> Since the function above is contained in a C file that comes with
> our Haskell library, I tried to add "cc-options: -g" and
> "cc-options: -ggdb -O0", but maybe somewhere symbols are stripped.
> So I added the bogus function call to "g_rand_new()" which is not
> called anywhere else and gdb stops as follows:
>
> acquiring lock to add a GtkButton object at 2105020
> value of lock function is 0
> [Switching to Thread 0x7ffff41ff710 (LWP 15735)]
>
> Breakpoint 12, 0x00007ffff115bfa0 in g_rand_new () from /usr/lib/libglib-2.0.so
>
> This all seems reasonable, but:
>
> (gdb) bt
> #0 0x00007ffff115bfa0 in g_rand_new () from /usr/lib/libglib-2.0.so
> #1 0x00000000419b3792 in ?? ()
> #2 0x00007ffff678f078 in ?? ()
>
> i.e. the calling context is broken. I'm very, very sure that the
> caller is indeed the above-mentioned function, since g_rand_new
> isn't called anywhere else in my Haskell program (and otherwise the
> calling context would be sane).
> I'm also passing the address of gtk2hs_g_object_unref_from_mainloop
> as FinalizerPtr to all my ForeignPtrs, so there is no inlining going
> on.
>
> Back to the culprit, the call to g_static_mutex_lock. This is a
> macro that expands to
>
> *g_thread_functions_for_glib_use.mutex_lock
>
> where g_thread_functions_for_glib_use is a global variable that
> contains a lot of function pointers.
> At the break point, it contains this:
>
> (gdb) print g_thread_functions_for_glib_use
> $33 = {mutex_new = 0x7ffff0cd9820 <g_mutex_new_posix_impl>,
>   mutex_lock = 0x7ffff6c8b3c0 <__pthread_mutex_lock>,
>   mutex_trylock = 0x7ffff0cd97b0 <g_mutex_trylock_posix_impl>,
>   mutex_unlock = 0x7ffff6c8ca00 <__pthread_mutex_unlock>,
>   mutex_free = 0x7ffff0cd9740 <g_mutex_free_posix_impl>,
>   [..]
>
> So the call to g_mutex_lock should call the function
> __pthread_mutex_lock but it calls NULL.
>
> I hoped that writing this email would give me a bit more insight
> into the problem, but for now I suspect that something overwrites
> either the stack or the code of the function.
>
> On the same platform, the compiled version prints:
>
> acquiring lock to add a GtkButton object at 1b05820
> value of lock function is 7f7adcabd3c0
> within mutex: adding finalizer to a GtkButton object!
>
> On Mac OS or i386, using ghci or ghc, version 6.10.4, it works as
> well.
> Now for the fun bit: on i386 using ghci version 6.12.1 it works too.
>
> So it's an x86-64 and ghc 6.12.1 bug. According to Christian Maeder
> who submitted the ticket, the problem persists in 6.12.3.
>
> Any hints and help appreciated,
> Cheers,
> Axel
>
> _______________________________________________
> Glasgow-haskell-users mailing list
> glasgow-haskell-us...@haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

_______________________________________________
Gtk2hs-devel mailing list
Gtk2hs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/gtk2hs-devel
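
For readers following the quoted message, here is a minimal sketch of the kind of dispatch it describes: a call made through a function pointer stored in a global table. The table demo_thread_functions stands in for g_thread_functions_for_glib_use, and every name starting with demo_ is hypothetical, made up for illustration; none of this is GLib or Gtk2Hs code. The sketch does not reproduce the ghci address problem. It only shows how checking the pointer before the call would turn a jump to NULL into a diagnostic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a global table of thread functions. */
typedef struct {
    void (*mutex_lock)(void *mutex);
    void (*mutex_unlock)(void *mutex);
} demo_thread_function_table;

/* File-scope object: all pointers are NULL until demo_threads_init() runs. */
demo_thread_function_table demo_thread_functions;

/* Counts how often the demo lock function was actually reached. */
int demo_lock_calls = 0;

static void demo_lock(void *mutex)   { (void) mutex; demo_lock_calls++; }
static void demo_unlock(void *mutex) { (void) mutex; }

/* Fills in the table, as a threading library's init routine might. */
void demo_threads_init(void) {
    demo_thread_functions.mutex_lock = demo_lock;
    demo_thread_functions.mutex_unlock = demo_unlock;
}

/* Calls mutex_lock through the table, but only if the pointer is set.
 * Returns 1 if the lock function was called, 0 if the pointer was NULL. */
int demo_checked_lock(void *mutex) {
    assert(mutex != NULL);
    if (demo_thread_functions.mutex_lock == NULL) {
        fprintf(stderr, "mutex_lock is NULL; refusing to call it\n");
        return 0;
    }
    demo_thread_functions.mutex_lock(mutex);
    return 1;
}
```

Calling demo_checked_lock before demo_threads_init returns 0 and prints the diagnostic; after initialisation it returns 1. A guard like this would not fix the underlying problem, where the C module sees the wrong address for the global, but it would make the failure visible without a segfault.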