[ 
https://issues.apache.org/jira/browse/ARROW-17093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17577386#comment-17577386
 ] 

Ben Kietzman edited comment on ARROW-17093 at 8/9/22 12:26 PM:
---------------------------------------------------------------

... however, having written that, I think the correct solution to the 
all-threads-trace problem is to let the process dump core and then read the 
stacks out of the core file. This has two advantages over in-process tracing:
- When a signal handler exists, the non-signaled threads continue executing 
until they receive signals of their own. However, if a signal is known to be 
fatal, the OS can shut threads down more aggressively. This means we can get 
less out-of-date traces from the threads which *didn't* segfault than we can 
with interthread signals.
- We'd probably be reading the core dump with gdb or another debugger, so we'd 
have access to the process's full memory and could print not just snippets of 
the source files but the values of local variables as well.
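
As a rough illustration of the core-dump route, a test binary could raise its 
own core-size limit at startup so that a fatal signal is allowed to produce a 
core file. This is only a sketch: EnableCoreDumps is a hypothetical helper, not 
an existing Arrow function, and whether and where a core file is actually 
written still depends on the host (for example the kernel's core_pattern 
setting on Linux).

```cpp
#include <sys/resource.h>

#include <cstdio>

// Hypothetical helper (not part of Arrow): raise the soft RLIMIT_CORE limit
// to the hard limit so that a fatal signal may produce a core file.
// Returns true on success.
bool EnableCoreDumps() {
  struct rlimit limit;
  if (getrlimit(RLIMIT_CORE, &limit) != 0) {
    std::perror("getrlimit");
    return false;
  }
  // An unprivileged process may raise its soft limit up to the hard limit.
  limit.rlim_cur = limit.rlim_max;
  if (setrlimit(RLIMIT_CORE, &limit) != 0) {
    std::perror("setrlimit");
    return false;
  }
  return true;
}
```

After a crash, the stacks of every thread, including local variables, could 
then be read with something like gdb's "thread apply all bt full" run against 
the binary and the core file.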


> [C++][CI] Enable libSegFault for C++ tests
> ------------------------------------------
>
>                 Key: ARROW-17093
>                 URL: https://issues.apache.org/jira/browse/ARROW-17093
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Continuous Integration
>            Reporter: David Li
>            Priority: Major
>
> Adding libSegFault.so could make it easier to diagnose CI failures. It will 
> print a backtrace on segfault.
> {noformat}
>   env SEGFAULT_SIGNALS=all \
>       LD_PRELOAD=/lib/x86_64-linux-gnu/libSegFault.so
> {noformat}
> This will give a backtrace like this on segfault:
> {noformat}
> Backtrace:
> /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f8f4a0b900b]
> /lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f8f4a098859]
> /lib/x86_64-linux-gnu/libc.so.6(+0x8d26e)[0x7f8f4a10326e]
> /lib/x86_64-linux-gnu/libc.so.6(+0x952fc)[0x7f8f4a10b2fc]
> /lib/x86_64-linux-gnu/libc.so.6(+0x96f6d)[0x7f8f4a10cf6d]
> /tmp/arrow-HEAD.y8UwB/cpp-build/release/flight-test-integration-client(_ZNSt8_Rb_treeISt10shared_ptrIN5arrow8DataTypeEES3_St9_IdentityIS3_ESt4lessIS3_ESaIS3_EE8_M_eraseEPSt13_Rb_tree_nodeIS3_E+0x39)[0x5557a9a83b19]
> /tmp/arrow-HEAD.y8UwB/cpp-build/release/flight-test-integration-client(_ZNSt8_Rb_treeISt10shared_ptrIN5arrow8DataTypeEES3_St9_IdentityIS3_ESt4lessIS3_ESaIS3_EE8_M_eraseEPSt13_Rb_tree_nodeIS3_E+0x1f)[0x5557a9a83aff]
> /tmp/arrow-HEAD.y8UwB/cpp-build/release/flight-test-integration-client(_ZNSt3setISt10shared_ptrIN5arrow8DataTypeEESt4lessIS3_ESaIS3_EED1Ev+0x33)[0x5557a9a83b83]
> /lib/x86_64-linux-gnu/libc.so.6(__cxa_finalize+0xce)[0x7f8f4a0bcfde]
> /tmp/arrow-HEAD.y8UwB/cpp-build/release/libarrow.so.900(+0x440b67)[0x7f8f47d56b67]
> {noformat}
> Caveats:
>  * The path is OS-specific
>  * We could integrate it into the build tooling instead of doing it via env 
> var
>  * Are there easily accessible equivalents for macOS and Windows we could use?
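
For comparison, the in-process technique that a preloaded handler relies on can 
be sketched with glibc's backtrace() and backtrace_symbols_fd(). This is not 
libSegFault's actual code, and DumpBacktrace, OnFatalSignal and 
InstallSegvBacktraceHandler are hypothetical names. It assumes glibc (or 
another libc that provides <execinfo.h>) and prints only the stack of the 
thread that received the signal.

```cpp
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

// Hypothetical helper: write the calling thread's stack to `fd`.
// Returns the number of frames captured.
int DumpBacktrace(int fd) {
  void* frames[64];
  int count = backtrace(frames, 64);
  // Unlike backtrace_symbols(), this variant does not call malloc().
  backtrace_symbols_fd(frames, count, fd);
  return count;
}

// Signal handler: print the faulting thread's stack, then re-raise.
extern "C" void OnFatalSignal(int signo) {
  static const char kHeader[] = "Backtrace:\n";
  ssize_t ignored = write(STDERR_FILENO, kHeader, sizeof(kHeader) - 1);
  (void)ignored;
  DumpBacktrace(STDERR_FILENO);
  // SA_RESETHAND has already restored the default action, so the re-raised
  // signal terminates the process once this handler returns.
  raise(signo);
}

// Hypothetical helper: install the handler for SIGSEGV only.
bool InstallSegvBacktraceHandler() {
  struct sigaction action = {};
  action.sa_handler = OnFatalSignal;
  action.sa_flags = SA_RESETHAND;
  sigemptyset(&action.sa_mask);
  return sigaction(SIGSEGV, &action, nullptr) == 0;
}
```

Note that backtrace() is not guaranteed to be async-signal-safe on its first 
call, so a real handler would want to call it once at startup before relying 
on it inside a signal handler.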



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
