[ 
https://issues.apache.org/jira/browse/KUDU-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved KUDU-2275.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.7.0

Upgraded to libunwind 1.3-rc1 to fix this.

> SIGSEGV due to bug in libunwind
> -------------------------------
>
>                 Key: KUDU-2275
>                 URL: https://issues.apache.org/jira/browse/KUDU-2275
>             Project: Kudu
>          Issue Type: Bug
>    Affects Versions: 1.6.0
>            Reporter: Will Berkeley
>            Assignee: Todd Lipcon
>            Priority: Major
>             Fix For: 1.7.0
>
>
> Rarely, the kernel stack watchdog can cause a segfault due to a bug in 
> libunwind.
> {noformat}
> *** Aborted at 1516180006 (unix time) try "date -d @1516180006" if you are using GNU date ***
> PC: @ 0x8c94b4 (unknown)
> *** SIGSEGV (@0x7f27173e0000) received by PID 22279 (TID 0x7f270f87f700) from PID 389939200; stack trace: ***
> {noformat}
> From a core file (produced from the minidump), the backtrace is:
> {noformat}
> #0  access_mem (as=<optimized out>, addr=139805870391296, val=0x7f270f87bcc0, write=<optimized out>, arg=<optimized out>)
>    at /usr/src/debug/kudu-1.5.0-cdh5.13.1/thirdparty/src/libunwind-1.1a/src/x86_64/Ginit.c:173
> #1  0x00000000008c8e02 in is_plt_entry (c=0x7f270f87c0e0) at /usr/src/debug/kudu-1.5.0-cdh5.13.1/thirdparty/src/libunwind-1.1a/src/x86_64/Gstep.c:43
> #2  _ULx86_64_step (cursor=0x7f270f87c0e0) at /usr/src/debug/kudu-1.5.0-cdh5.13.1/thirdparty/src/libunwind-1.1a/src/x86_64/Gstep.c:125
> #3  0x00000000008c412d in google::GetStackTrace (result=result@entry=0x292c0c8, max_depth=max_depth@entry=16, skip_count=0, skip_count@entry=2)
>    at /usr/src/debug/kudu-1.5.0-cdh5.13.1/thirdparty/src/glog-0.3.5/src/stacktrace_libunwind-inl.h:78
> #4  0x0000000001a9be8c in Collect (skip_frames=2, this=0x292c0c0) at /usr/src/debug/kudu-1.5.0-cdh5.13.1/src/kudu/util/debug-util.cc:350
> #5  kudu::(anonymous namespace)::HandleStackTraceSignal (signum=<optimized out>) at /usr/src/debug/kudu-1.5.0-cdh5.13.1/src/kudu/util/debug-util.cc:176
> #6  0x00007f2716854670 in _quicksort () from ./lib64/libc.so.6
> #7  0x0000000000000000 in ?? ()
> {noformat}
> Note that addr = 139805870391296 = 0x7f27173e0000.
> The segfault happens because libunwind is accessing invalid memory it's 
> supposed to have validated:
> {code:c}
> /* validate address */
> const struct cursor *c = (const struct cursor *)arg;
> if (likely (c != NULL) && unlikely (c->validate)
>     && unlikely (validate_mem (addr)))
>     return -1;
> *val = *(unw_word_t *) addr;{code}
> [Others|https://lists.nongnu.org/archive/html/libunwind-devel/2016-09/msg00001.html]
>  have seen this same problem before.
> There's also a fix for this issue in commit 
> 836c91c43d7a996028aa7e8d1f53630a6b8e7cbe. It's not in any release of 
> libunwind yet, so we could do one of the following:
>  # upgrade libunwind to 1.2 (most recent release) and patch in the fix
>  # upgrade to a snapshot containing the fix
> As a workaround, one can set --hung_task_check_interval_ms to a large value 
> like 2^30, so the stack watchdog runs very rarely (the flag is a 32-bit 
> signed integer, so the value cannot exceed 2^31 - 1). The tradeoff is the 
> effective loss of the stack watchdog, which can make debugging certain 
> performance problems more difficult.
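
As a quick sanity check of two figures in the report above, the following
Python sketch confirms that the decimal addr in frame #0 matches the hex fault
address on the SIGSEGV line, and that 2^30 ms fits in a signed 32-bit flag.
The variable names are made up for illustration; only the flag name and the
numbers come from the report.

```python
# Fault address as it appears in the backtrace (decimal) and on the
# SIGSEGV line (hex). Both values are copied from the report.
backtrace_addr = 139805870391296
sigsegv_addr = 0x7F27173E0000
same_address = backtrace_addr == sigsegv_addr

# Suggested workaround value for --hung_task_check_interval_ms.
INT32_MAX = 2**31 - 1          # largest value a signed 32-bit flag can hold
interval_ms = 2**30
fits_in_int32 = interval_ms <= INT32_MAX

# How rarely the watchdog would run with that setting.
interval_days = interval_ms / 1000 / 86400

print(hex(backtrace_addr), same_address)
print(f"--hung_task_check_interval_ms={interval_ms}", fits_in_int32)
print(f"about {interval_days:.1f} days between checks")
```

With 2^30 the interval works out to a little over twelve days, which is why
the workaround amounts to disabling the stack watchdog in practice.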



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)