On Oct 23, 2013, at 8:11 PM, Taehwan Weon <[email protected]> wrote:
> I am using jemalloc-3.4.0 on CentOS 6.3.
> When my SEGV signal handler tried to dump call stacks, a hang occurred, as
> shown below.
> I don't know why libc's fork() called jemalloc_prefork even though I didn't
> set LD_PRELOAD.
> 
> #0  0x000000351d60d654 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x000000351d608f4a in _L_lock_1034 () from /lib64/libpthread.so.0
> #2  0x000000351d608e0c in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00002b59a8055d6d in malloc_mutex_lock (mutex=0x2b59a84a4320) at 
> include/jemalloc/internal/mutex.h:77
> #4  malloc_mutex_prefork (mutex=0x2b59a84a4320) at src/mutex.c:109
> #5  0x00002b59a8044c32 in arena_prefork (arena=0x2b59a84a3d40) at 
> src/arena.c:2344
> #6  0x00002b59a803f555 in jemalloc_prefork () at src/jemalloc.c:1760
> #7  0x000000351ca9a2a6 in fork () from /lib64/libc.so.6
> #8  0x000000351ca6200d in _IO_proc_open@@GLIBC_2.2.5 () from /lib64/libc.so.6
> #9  0x000000351ca62269 in popen@@GLIBC_2.2.5 () from /lib64/libc.so.6
> #10 0x00002b59a71bc1f9 in backtrace_lineinfo (number=1, address=<value 
> optimized out>, symbol=0x2b61f4000918 "/usr/lib64/libnc.so.2 
> [0x2b59a71bc3b1]") at cfs_apix.c:363
> #11 0x00002b59a71bc3ff in nc_dump_stack (sig=<value optimized out>) at 
> cfs_apix.c:423
> #12 <signal handler called>
> #13 0x00002b59a8047332 in arena_dalloc_bin_locked (arena=0x2b59a84a3d40, 
> chunk=0x2b61ef000000, ptr=<value optimized out>, mapelm=<value optimized 
> out>) at src/arena.c:1717
> #14 0x00002b59a805fba4 in tcache_bin_flush_small (tbin=0x2b61ef107128, 
> binind=8, rem=9, tcache=0x2b61ef107000) at src/tcache.c:127
> #15 0x00002b59a805fcd4 in tcache_event_hard (tcache=0x80) at src/tcache.c:39
> #16 0x00002b59a80428d9 in tcache_event (ptr=0x2b61efd26dc0) at 
> include/jemalloc/internal/tcache.h:271
> #17 tcache_dalloc_small (ptr=0x2b61efd26dc0) at 
> include/jemalloc/internal/tcache.h:408
> #18 arena_dalloc (ptr=0x2b61efd26dc0) at 
> include/jemalloc/internal/arena.h:1003
> #19 idallocx (ptr=0x2b61efd26dc0) at 
> include/jemalloc/internal/jemalloc_internal.h:913
> #20 iqallocx (ptr=0x2b61efd26dc0) at 
> include/jemalloc/internal/jemalloc_internal.h:932
> #21 iqalloc (ptr=0x2b61efd26dc0) at 
> include/jemalloc/internal/jemalloc_internal.h:939
> #22 jefree (ptr=0x2b61efd26dc0) at src/jemalloc.c:1272
> #23 0x00002b59a71c958b in __nc_free (p=0x2b61efd26dd0, file=<value optimized 
> out>, lno=<value optimized out>) at util.c:1916
> #24 0x00002b59a71cb975 in tlcq_dequeue (q=0x2b59a9ee3210, msec=<value 
> optimized out>) at tlc_queue.c:215
> #25 0x00002b59a71c154b in tp_worker (d=<value optimized out>) at 
> threadpool.c:116
> #26 0x000000351d60683d in start_thread () from /lib64/libpthread.so.0
> #27 0x000000351cad503d in clone () from /lib64/libc.so.6
> 
> Any hint would be greatly appreciated.

jemalloc calls pthread_atfork(3) to install functions that run just before and
after fork(2).  In this case your application raised a signal while deep inside
jemalloc (very likely due to memory corruption), with locks already acquired,
so the prefork handler blocks on a lock the interrupted thread still holds.
Deadlock obviously results.  To my surprise, fork() is actually listed in the
signal(7) manual page as an async-signal-safe function, though popen(3) isn't,
and popen() would probably allocate memory even if it got past the hang during
fork().  If you were to call fork() directly, you'd be hitting a peculiar
failure condition and we could make a case for jemalloc's behavior being
questionable, but as your signal handler is written, it is simply unreliable
because it calls functions outside the list of async-signal-safe functions.
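
For reference, here is a minimal sketch of the pthread_atfork(3) pattern
involved (the handler names here are illustrative, not jemalloc's actual
internals); it shows why a fork() issued from a signal handler deadlocks when
the interrupted thread already holds an allocator lock:

/*
 * Illustrative sketch (not jemalloc's real internals) of the
 * pthread_atfork(3) pattern that makes fork() acquire allocator locks.
 */
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

static void prefork(void)         { pthread_mutex_lock(&alloc_lock); }
static void postfork_parent(void) { pthread_mutex_unlock(&alloc_lock); }
static void postfork_child(void)  { pthread_mutex_unlock(&alloc_lock); }

int
main(void)
{

	pthread_atfork(prefork, postfork_parent, postfork_child);
	/*
	 * If a signal handler interrupts a thread that already holds
	 * alloc_lock and then calls fork() (directly, or via popen(3) as in
	 * the backtrace above), prefork() blocks forever on the lock the
	 * interrupted thread still holds.
	 */
	if (fork() == 0)
		_exit(0);
	return (0);
}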
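
If you need a stack dump from inside a SEGV handler, one common workaround
under glibc is backtrace(3) plus backtrace_symbols_fd(3), which write to a
file descriptor instead of forking a child.  This is only a sketch of that
workaround, not a guarantee of strict async-signal-safety:

/*
 * Sketch of a stack-dumping handler that avoids popen(3)/malloc(3):
 * backtrace_symbols_fd(3) writes straight to a file descriptor.
 * backtrace(3) can allocate the first time it is called (lazy libgcc
 * load), so call it once at startup.  Link with -rdynamic to get symbol
 * names.
 */
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

static void
dump_stack(int sig)
{
	void *frames[64];
	int n = backtrace(frames, 64);

	backtrace_symbols_fd(frames, n, STDERR_FILENO);
	_exit(128 + sig);
}

int
main(void)
{
	void *warm[1];

	backtrace(warm, 1);	/* Warm up libgcc outside the handler. */
	signal(SIGSEGV, dump_stack);
	raise(SIGSEGV);		/* Demonstration: dump a trace, then exit. */
	return (0);
}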

Jason

