[ https://issues.apache.org/jira/browse/IMPALA-10088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhaorenhai resolved IMPALA-10088.
---------------------------------
    Assignee: zhaorenhai
  Resolution: Fixed

> DeadLock while run unifiedbetests on aarch64 platform
> -----------------------------------------------------
>
>                 Key: IMPALA-10088
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10088
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: zhaorenhai
>            Assignee: zhaorenhai
>            Priority: Major
>
> When running unifiedbetests and impalad on the aarch64 platform, a deadlock
> occurs while tcmalloc is initializing.
> The stack trace is as follows:
>
> {code:java}
> (gdb) bt
> #0 0x0000ffff83099544 in __GI___nanosleep (requested_time=0xffffffc71698, remaining=0x0) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
> #1 0x00000000054cf144 in base::internal::SpinLockDelay (w=0x77385b0 <tcmalloc::Static::pageheap_lock_>, value=2, loop=727956) at /home/impala/impala/be/src/gutil/spinlock_linux-inl.h:86
> #2 0x0000000005529800 in SpinLock::SlowLock() ()
> #3 0x00000000055fb5c4 in tcmalloc::ThreadCache::InitModule() ()
> #4 0x0000000005743374 in tc_calloc ()
> #5 0x0000ffff81c737f4 in _dlerror_run (operate=operate@entry=0xffff81c73158 <dlsym_doit>, args=0xffffffc717d8, args@entry=0xffffffc717f8) at dlerror.c:140
> #6 0x0000ffff81c731f0 in __dlsym (handle=<optimized out>, name=<optimized out>) at dlsym.c:70
> #7 0x000000000310ee04 in (anonymous namespace)::dlsym_or_die (sym=0x606b260 "dlopen") at /home/impala/impala/be/src/kudu/util/debug/unwind_safeness.cc:74
> #8 0x000000000310ef1c in (anonymous namespace)::InitIfNecessary () at /home/impala/impala/be/src/kudu/util/debug/unwind_safeness.cc:100
> #9 0x000000000310f0b4 in dl_iterate_phdr (callback=0xffff81620d18 <_Unwind_IteratePhdrCallback>, data=0xffffffc71900) at /home/impala/impala/be/src/kudu/util/debug/unwind_safeness.cc:158
> #10 0x0000ffff816215b4 in _Unwind_Find_FDE (pc=0xffff8161f98f <_Unwind_Backtrace+79>, bases=bases@entry=0xffffffc72438) at ../../../gcc-7.5.0/libgcc/unwind-dw2-fde-dip.c:469
> #11 0x0000ffff8161dfdc in uw_frame_state_for (context=context@entry=0xffffffc72110, fs=fs@entry=0xffffffc719f0) at ../../../gcc-7.5.0/libgcc/unwind-dw2.c:1249
> #12 0x0000ffff8161ef3c in uw_init_context_1 (context=context@entry=0xffffffc72110, outer_cfa=0xffffffc72b50, outer_cfa@entry=0xffffffc72be0, outer_ra=0x55298d8 <GetStackTrace_libgcc(void**, int, int)+40>) at ../../../gcc-7.5.0/libgcc/unwind-dw2.c:1578
> #13 0x0000ffff8161f990 in _Unwind_Backtrace (trace=0x5529a48 <libgcc_backtrace_helper(_Unwind_Context*, void*)>, trace_argument=0xffffffc72b68) at ../../../gcc-7.5.0/libgcc/unwind.inc:283
> #14 0x00000000055298d8 in GetStackTrace_libgcc(void**, int, int) ()
> #15 0x0000000005529db4 in GetStackTrace(void**, int, int) ()
> #16 0x00000000055f891c in tcmalloc::PageHeap::GrowHeap(unsigned long) ()
> {code}
> I think this is the same issue as
> [https://github.com/gperftools/gperftools/issues/1184],
> because it happens whether I build gperftools with or without libunwind.
>
> Kudu also has the same issue:
> https://issues.apache.org/jira/browse/KUDU-3072
> I think the solution in the following link is not correct:
> [https://gerrit.cloudera.org/#/c/15420/]
> On aarch64, the method of getting a stack trace is not the same as on arm.
> I think the correct way to get the stack trace is something like this:
> [https://github.com/abseil/abseil-cpp/blob/master/absl/debugging/internal/stacktrace_aarch64-inl.inc]
> or simply to use libunwind or gcc.
>
> However, gperftools may not be the root cause of this issue, because both
> gperftools and libunwind now fully support aarch64 (with libunwind or gcc).
> Maybe this Kudu commit has a bug?
> [https://github.com/apache/kudu/commit/b621f9c1a3949dc31ca4836b0767b2840fa73f29]
> On x86, gperftools does not use libunwind or libgcc to get the stack trace,
> so the issue does not occur there.
> I tried changing
> {code:java}
> #if !defined(THREAD_SANITIZER) && !defined(__APPLE__)
> #define HOOK_DL_ITERATE_PHDR 1
> #endif
> {code}
> to
> {code:java}
> #if !defined(THREAD_SANITIZER) && !defined(__APPLE__) && !defined(__aarch64__)
> #define HOOK_DL_ITERATE_PHDR 1
> #endif
> {code}
> and the deadlock no longer occurs.
>
> [~tarmstr...@cloudera.com] [~tlipcon] [~adar]
> What do you think about this issue? How should it be fixed? Any suggestions?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
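
The call chain in the stack trace can be modelled as a re-entrant acquisition of a non-recursive lock. The following is only a minimal sketch of that reading, not the gperftools or Kudu code: every `Toy*` name and the `hook_enabled` parameter are hypothetical stand-ins, and it assumes (as the trace suggests) that GrowHeap already holds pageheap_lock_ when it asks for a stack trace. To stay runnable, the toy lock uses a try-lock and records the re-entry instead of spinning forever.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a non-recursive spin lock such as pageheap_lock_.
class ToySpinLock {
 public:
  // Returns true if the lock was free and is now held by the caller.
  bool TryLock() { return !held_.exchange(true, std::memory_order_acquire); }
  void Unlock() { held_.store(false, std::memory_order_release); }

 private:
  std::atomic<bool> held_{false};
};

static ToySpinLock g_heap_lock;
static bool g_reentered = false;

// Models tc_calloc -> ThreadCache::InitModule, which needs the heap lock.
void ToyAllocate() {
  if (!g_heap_lock.TryLock()) {
    // A real spin lock would wait here forever; the toy only records it.
    g_reentered = true;
    return;
  }
  g_heap_lock.Unlock();
}

// Models the hooked dl_iterate_phdr: its lazy initialization calls dlsym,
// and dlsym allocates. Without the hook, no allocation happens on this path.
void ToyHookedDlIteratePhdr(bool hook_enabled) {
  if (hook_enabled) ToyAllocate();
}

// Models GrowHeap: takes the heap lock, then unwinds the stack, which
// reaches dl_iterate_phdr. Returns true if the lock was re-entered.
bool ToyGrowHeap(bool hook_enabled) {
  g_reentered = false;
  bool took_lock = g_heap_lock.TryLock();
  assert(took_lock);
  ToyHookedDlIteratePhdr(hook_enabled);
  g_heap_lock.Unlock();
  return g_reentered;
}
```

In this model, `ToyGrowHeap(true)` reports a re-entry and `ToyGrowHeap(false)` does not, which is consistent with the observation that leaving HOOK_DL_ITERATE_PHDR undefined on aarch64 avoids the hang.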