[ 
https://issues.apache.org/jira/browse/IMPALA-10088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaorenhai resolved IMPALA-10088.
---------------------------------
      Assignee: zhaorenhai
    Resolution: Fixed

> DeadLock while run unifiedbetests on aarch64 platform
> -----------------------------------------------------
>
>                 Key: IMPALA-10088
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10088
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: zhaorenhai
>            Assignee: zhaorenhai
>            Priority: Major
>
> When running unifiedbetests and impalad on the aarch64 platform, a deadlock 
> occurs while tcmalloc is initializing.
> The stack trace is as follows:
>  
> {code:java}
> (gdb) bt
> #0  0x0000ffff83099544 in __GI___nanosleep (requested_time=0xffffffc71698, 
> remaining=0x0) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
> #1  0x00000000054cf144 in base::internal::SpinLockDelay (w=0x77385b0 
> <tcmalloc::Static::pageheap_lock_>, value=2, loop=727956) at 
> /home/impala/impala/be/src/gutil/spinlock_linux-inl.h:86
> #2  0x0000000005529800 in SpinLock::SlowLock() ()
> #3  0x00000000055fb5c4 in tcmalloc::ThreadCache::InitModule() ()
> #4  0x0000000005743374 in tc_calloc ()
> #5  0x0000ffff81c737f4 in _dlerror_run (operate=operate@entry=0xffff81c73158 
> <dlsym_doit>, args=0xffffffc717d8, args@entry=0xffffffc717f8) at dlerror.c:140
> #6  0x0000ffff81c731f0 in __dlsym (handle=<optimized out>, name=<optimized 
> out>) at dlsym.c:70
> #7  0x000000000310ee04 in (anonymous namespace)::dlsym_or_die (sym=0x606b260 
> "dlopen") at /home/impala/impala/be/src/kudu/util/debug/unwind_safeness.cc:74
> #8  0x000000000310ef1c in (anonymous namespace)::InitIfNecessary () at 
> /home/impala/impala/be/src/kudu/util/debug/unwind_safeness.cc:100
> #9  0x000000000310f0b4 in dl_iterate_phdr (callback=0xffff81620d18 
> <_Unwind_IteratePhdrCallback>, data=0xffffffc71900) at 
> /home/impala/impala/be/src/kudu/util/debug/unwind_safeness.cc:158
> #10 0x0000ffff816215b4 in _Unwind_Find_FDE (pc=0xffff8161f98f 
> <_Unwind_Backtrace+79>, bases=bases@entry=0xffffffc72438) at 
> ../../../gcc-7.5.0/libgcc/unwind-dw2-fde-dip.c:469
> #11 0x0000ffff8161dfdc in uw_frame_state_for 
> (context=context@entry=0xffffffc72110, fs=fs@entry=0xffffffc719f0) at 
> ../../../gcc-7.5.0/libgcc/unwind-dw2.c:1249
> #12 0x0000ffff8161ef3c in uw_init_context_1 
> (context=context@entry=0xffffffc72110, outer_cfa=0xffffffc72b50, 
> outer_cfa@entry=0xffffffc72be0, outer_ra=0x55298d8 
> <GetStackTrace_libgcc(void**, int, int)+40>)
>     at ../../../gcc-7.5.0/libgcc/unwind-dw2.c:1578
> #13 0x0000ffff8161f990 in _Unwind_Backtrace (trace=0x5529a48 
> <libgcc_backtrace_helper(_Unwind_Context*, void*)>, 
> trace_argument=0xffffffc72b68) at ../../../gcc-7.5.0/libgcc/unwind.inc:283
> #14 0x00000000055298d8 in GetStackTrace_libgcc(void**, int, int) ()
> #15 0x0000000005529db4 in GetStackTrace(void**, int, int) ()
> #16 0x00000000055f891c in tcmalloc::PageHeap::GrowHeap(unsigned long) ()
> {code}
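The backtrace above suggests a re-entrant acquisition of a non-recursive lock: frame #16 (GrowHeap) requests a stack trace, the unwinder reaches the dl_iterate_phdr hook, dlsym calls calloc, and tcmalloc's InitModule then spins on pageheap_lock_ in the same thread. A minimal sketch of that pattern follows. Every name in it (ToySpinLock, ToyGrowHeap, and so on) is hypothetical and stands in for the real code, and it assumes GrowHeap runs with the lock already held, which the truncated backtrace does not show directly. It uses TryLock so that the re-entry is reported instead of hanging.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a non-recursive spinlock such as pageheap_lock_.
class ToySpinLock {
 public:
  // Returns true only if the lock was free and is now held by the caller.
  bool TryLock() { return !held_.exchange(true, std::memory_order_acquire); }
  void Unlock() { held_.store(false, std::memory_order_release); }

 private:
  std::atomic<bool> held_{false};
};

ToySpinLock g_heap_lock;
bool g_reentry_detected = false;

// Stand-in for the allocator entry point reached through dlsym -> calloc.
void ToyAllocatorInit() {
  if (!g_heap_lock.TryLock()) {
    // A blocking Lock() would spin here forever, because the owner of the
    // lock is this same thread, further up the call stack.
    g_reentry_detected = true;
    return;
  }
  g_heap_lock.Unlock();
}

// Stand-in for the unwinder path that ends up allocating memory.
void ToyCaptureStackTrace() { ToyAllocatorInit(); }

// Stand-in for heap growth: takes the lock, then records a stack trace.
void ToyGrowHeap() {
  bool ok = g_heap_lock.TryLock();
  assert(ok);
  (void)ok;  // Avoids an unused-variable warning when NDEBUG is defined.
  ToyCaptureStackTrace();
  g_heap_lock.Unlock();
}

// Returns true if the nested call found the lock already held.
bool DemonstrateReentry() {
  g_reentry_detected = false;
  ToyGrowHeap();
  return g_reentry_detected;
}
```

Calling DemonstrateReentry() from a main function returns true: the nested call finds the lock already taken by its own caller, which is the point where a blocking spinlock would never make progress.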
> I think this is the same issue as 
> [https://github.com/gperftools/gperftools/issues/1184],
> because the deadlock happens whether I build gperftools with libunwind 
> or without it.
>  
> Kudu has the same issue:
> https://issues.apache.org/jira/browse/KUDU-3072
> I do not think the solution in the following change is correct:
> [https://gerrit.cloudera.org/#/c/15420/]
> On aarch64, the method of getting a stack trace is not the same as on arm.
> I think the correct way to get the stack trace is something like this:
> [https://github.com/abseil/abseil-cpp/blob/master/absl/debugging/internal/stacktrace_aarch64-inl.inc]
>  or simply to use libunwind or gcc.
>  
> But I think gperftools may not be the root cause of this issue, because 
> both gperftools and libunwind now support aarch64 well (with 
> libunwind or gcc).
> Maybe this Kudu commit has a bug?
> [https://github.com/apache/kudu/commit/b621f9c1a3949dc31ca4836b0767b2840fa73f29]
> On x86, gperftools does not use libunwind or libgcc to get the stack trace, 
> so the issue does not happen there.
> I tried changing:
> {code:java}
> #if !defined(THREAD_SANITIZER) && !defined(__APPLE__)
> #define HOOK_DL_ITERATE_PHDR 1
> #endif
> {code}
> to:
> {code:java}
> #if !defined(THREAD_SANITIZER) && !defined(__APPLE__) && !defined(__aarch64__)
> #define HOOK_DL_ITERATE_PHDR 1
> #endif
> {code}
> With this change, the deadlock no longer happens.
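To check what the modified guard does on a given build, a small compile-time probe can help. This is only a sketch: the guard is copied from the description above, while HookEnabled and IsAarch64 are hypothetical helpers that do not exist in Kudu or Impala.

```cpp
// Guard as proposed in the issue description.
#if !defined(THREAD_SANITIZER) && !defined(__APPLE__) && !defined(__aarch64__)
#define HOOK_DL_ITERATE_PHDR 1
#endif

// Hypothetical helper: true if the dl_iterate_phdr hook would be compiled in.
constexpr bool HookEnabled() {
#ifdef HOOK_DL_ITERATE_PHDR
  return true;
#else
  return false;
#endif
}

// Hypothetical helper: true when the compiler targets aarch64.
constexpr bool IsAarch64() {
#ifdef __aarch64__
  return true;
#else
  return false;
#endif
}

// With the modified guard, the hook can never be enabled on aarch64.
static_assert(!(IsAarch64() && HookEnabled()),
              "dl_iterate_phdr hook must stay disabled on aarch64");
```

The static_assert fails the build instead of failing at run time, so a later edit to the guard that re-enables the hook on aarch64 would be caught at compile time.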
>  
> [~tarmstr...@cloudera.com] [~tlipcon] [~adar]
> What do you think about this issue? How should it be fixed? Any suggestions?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
