On Jul 9, 2014, at 6:52 PM, Eduardo Silva <edsi...@gmail.com> wrote:
> On Wed, Jul 9, 2014 at 10:19 AM, Eduardo Silva <edsi...@gmail.com> wrote:
>> On Wed, Jul 9, 2014 at 8:44 AM, Jason Evans <jas...@canonware.com> wrote:
>>> On Jul 8, 2014, at 1:28 PM, Eduardo Silva <edsi...@gmail.com> wrote:
>>>> i am using jemalloc as part of our web services framework stack and
>>>> running on high loads (after every 6 hours of work) i find common
>>>> segfaults like the one described here.
>>>>
>>>> It was triggered on je_arena_dalloc_bin_locked(..). Do you have some
>>>> idea that what can be causing the problem ?
>>>>
>>>> (gdb) bt
>>>> #0 0x00007f50eab23425 in __GI_raise (sig=<optimized out>) at
>>>> ../nptl/sysdeps/unix/sysv/linux/raise.c:64
>>>> #1 0x00007f50eab26b8b in __GI_abort () at abort.c:91
>>>> #2 0x000000000040d232 in mk_signal_handler (signo=11,
>>>> si=0x7f50de7f96f0, context=0x7f50de7f95c0) at mk_signals.c:108
>>>> #3 <signal handler called>
>>>> #4 je_arena_dalloc_bin_locked (arena=0x7f50ea409240,
>>>> chunk=0x7f50e4c00000, ptr=<optimized out>, mapelm=<optimized out>) at
>>>> src/arena.c:1897
>>>
>>> This looks like a crash due to a double-freed region being flushed from the
>>> thread cache. You may be able to find the actual source of the problem if
>>> you use a debug build of jemalloc and disable thread caching
>>> (MALLOC_CONF=tcache:false).
>>
>> thanks, working on that.
>
> I saw in the program output the following:
>
> <jemalloc>: include/jemalloc/internal/arena.h:776: Failed assertion:
> "binind == actual_binind"
>
> looking at the backtrace:
>
> #0 0x00007fc1031aa425 in __GI_raise (sig=<optimized out>) at
> ../nptl/sysdeps/unix/sysv/linux/raise.c:64
> #1 0x00007fc1031adb8b in __GI_abort () at abort.c:91
> #2 0x00000000004256ca in je_arena_ptr_small_binind_get
> (ptr=<optimized out>, mapbits=<optimized out>) at
> include/jemalloc/internal/arena.h:764
> #3 0x00000000004259f5 in je_arena_salloc (ptr=<optimized out>,
> demote=<optimized out>) at include/jemalloc/internal/arena.h:1015
> #4 0x000000000041947d in je_isalloc (demote=false,
> ptr=0x7fc0fd4173d0) at
> include/jemalloc/internal/jemalloc_internal.h:849
> #5 ifree (ptr=0x7fc0fd4173d0) at src/jemalloc.c:1228
> #6 0x00007fc1025f1f6f in mk_mem_free (ptr=0x7fc0fd4173d0) at
> ../../../src/include/mk_memory.h:98
>
> it happened when releasing some memory...

A double free is still a possibility, but that particular failure mode moves
buffer overflow further up on the list of possibilities. If actual_binind
seems plausible but binind does not, then somehow the chunk’s page map got
corrupted, for example by double-freeing a large object. If on the other hand
binind seems plausible but actual_binind does not, then a buffer overflow may
have corrupted the run header.

Jason
_______________________________________________
jemalloc-discuss mailing list
jemalloc-discuss@canonware.com
http://www.canonware.com/mailman/listinfo/jemalloc-discuss
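
One way to separate the two cases on the application side is to rule the double free in or out first. The following is only a rough sketch, assuming all application frees go through a single wrapper (as the mk_mem_free frame in the backtrace suggests). The names tracked_malloc, tracked_free, TRACKED_FREE and MAX_LIVE are hypothetical and are not part of jemalloc or the framework. The table is not thread-safe and has a fixed size, so a multi-threaded server would need a lock and a real hash table around it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LIVE 1024          /* hypothetical cap, enough for a small test */

static void *live[MAX_LIVE];   /* pointers currently recorded as allocated */

/* Allocate and record the pointer. Returns NULL if malloc fails or the
 * table is full. */
static void *tracked_malloc(size_t size)
{
    void *ptr = malloc(size);
    if (ptr == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live[i] == NULL) {
            live[i] = ptr;
            return ptr;
        }
    }
    free(ptr);                 /* table full: refuse rather than lose track */
    return NULL;
}

/* Free only pointers that are recorded as live.
 * Returns 0 on success, -1 if ptr is unknown or was already freed. */
static int tracked_free(void *ptr, const char *file, int line)
{
    assert(file != NULL);
    if (ptr == NULL)
        return 0;              /* free(NULL) is a no-op */
    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live[i] == ptr) {
            live[i] = NULL;
            free(ptr);
            return 0;
        }
    }
    /* Not in the table: report the call site instead of corrupting the heap. */
    fprintf(stderr, "%s:%d: free of non-live pointer %p\n", file, line, ptr);
    return -1;
}

/* Convenience macro so the report carries the caller's file and line. */
#define TRACKED_FREE(p) tracked_free((p), __FILE__, __LINE__)
```

If a wrapper like this never reports a non-live pointer while the assertion still fires, that would point away from a double free in wrapped code and toward the buffer overflow case, where a tool such as Valgrind or AddressSanitizer is likely to be more useful.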