While running one of our internal benchmarks that uses hugepages, we see the tasks hang once the system runs out of hugepages. The hung tasks show the following backtrace:
#0 0x0000040003f5c478 in __lll_lock_wait_private (futex=0x40004000870) at ../nptl/sysdeps/unix/sysv/linux/lowlevellock.c:34
#1 0x0000040003ed3e0c in __libc_malloc (bytes=44) at malloc.c:3657
#2 0x0000040003e80bdc in _nl_make_l10nflist (l10nfile_list=0x40003fffca8, dirlist=0x40003faf040 "/usr/share/locale", dirlist_len=18, mask=<value optimized out>, language=0xffff067ae10 "en_US", territory=0x0, codeset=0x0, normalized_codeset=0x0, modifier=0x0, filename=0xffff067ae30 "LC_MESSAGES/libc.mo", do_allocate=0) at l10nflist.c:193
#3 0x0000040003e7e8c8 in _nl_find_domain (dirname=0x40003faf040 "/usr/share/locale", locale=0xffff067ae10 "en_US", domainname=0xffff067ae30 "LC_MESSAGES/libc.mo", domainbinding=0x0) at finddomain.c:88
#4 0x0000040003e7e104 in __dcigettext (domainname=0x40003faef58 "libc", msgid1=0x40003faf6a8 "Cannot allocate memory", msgid2=0x0, plural=<value optimized out>, n=0, category=<value optimized out>) at dcigettext.c:628
#5 0x0000040003e7ca10 in __dcgettext (domainname=<value optimized out>, msgid=<value optimized out>, category=<value optimized out>) at dcgettext.c:53
#6 0x0000040003ed96d4 in __strerror_r (errnum=<value optimized out>, buf=0x0, buflen=0) at _strerror.c:65
#7 0x0000040003ed95cc in strerror (errnum=<value optimized out>) at strerror.c:33
#8 0x000004000008d60c in ?? () from /usr/lib64/libhugetlbfs.so
#9 0x0000040003ed3338 in sYSMALLOc (av=0x40004000870, bytes=237753176) at malloc.c:3197
#10 _int_malloc (av=0x40004000870, bytes=237753176) at malloc.c:4747
#11 0x0000040003ed3c3c in __libc_malloc (bytes=237753176) at malloc.c:3660
#12 0x000004000055a9ac in .sftcr3d () from /usr/lib64/libpesslsmp.so.1
#13 0x0000040000457b9c in .pscrft3 () from /usr/lib64/libpesslsmp.so.1
#14 0x0000000010003a18 in .run_parallel ()
#15 0x0000000010002990 in .main ()

Versions used: libhugetlbfs 2.12-2.el6, gcc 4.4.6-3.el6.

It looks like a deadlock within malloc.
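To make the suspected pattern concrete, here is a minimal sketch of a non-recursive lock being requested again by the thread that already holds it. It is only an illustration, not the glibc or libhugetlbfs code: arena_lock, outer_alloc, failing_morecore and inner_alloc are hypothetical stand-ins for the malloc lock, the outer malloc call, the hugepage allocation hook and the allocation made while looking up the error message. The inner step uses pthread_mutex_trylock so that the sketch reports the re-entry instead of hanging.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the malloc lock; not glibc's real lock. */
static pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;

/* Set to 1 when the lock is requested while this thread already holds it. */
static int reentry_detected = 0;

/* Stand-in for the small allocation made while building the error string.
 * A blocking lock call here would wait forever on a lock held by the
 * calling thread; trylock lets the sketch observe that case and return. */
static void inner_alloc(void)
{
    int rc = pthread_mutex_trylock(&arena_lock);
    if (rc == EBUSY) {
        reentry_detected = 1;   /* the blocking variant would hang here */
        return;
    }
    if (rc == 0)
        pthread_mutex_unlock(&arena_lock);
}

/* Stand-in for the hook that fails to get more hugepages and then
 * reports the error, which triggers a nested allocation. */
static void *failing_morecore(void)
{
    inner_alloc();
    return NULL;                /* allocation failed */
}

/* Stand-in for the outer allocation: takes the lock, then asks the
 * hook for more memory while still holding it. */
static void *outer_alloc(void)
{
    pthread_mutex_lock(&arena_lock);
    void *p = failing_morecore();
    pthread_mutex_unlock(&arena_lock);
    return p;
}
```

Calling outer_alloc() once returns NULL and leaves reentry_detected set to 1. With a blocking lock in inner_alloc() the thread would instead wait on itself, which is consistent with frame #0 waiting on the same address (0x40004000870) that frames #9 and #10 show as av.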
When glibc calls into libhugetlbfs to allocate more hugepages and that allocation fails, strerror() is called, which in turn tries to allocate memory. That nested malloc then deadlocks while trying to acquire the lock, which the outer malloc call appears to hold already (the futex in frame #0 has the same address as av in frames #9 and #10).

After some googling, I found a similar bug reported against glibc at http://sourceware.org/bugzilla/show_bug.cgi?id=13699.

Any thoughts on what might be causing the tasks to hang?

Thanks,
Kamalesh.