[valgrind] [Bug 466172] SIGTRAP crash whenever getaddrinfo call is issued by valgrind
https://bugs.kde.org/show_bug.cgi?id=466172

--- Comment #8 from Mike J ---
Hi. This bug can be closed. It is not caused by valgrind.

In case it is of use to anyone in future: further checks have shown that TaniumClient version 7.4.9.1046 was running on the system and caused the problem. valgrind worked on the system with an earlier TaniumClient release, but stopped working when the TaniumClient package was upgraded late last year.

As indirectly noted in the earlier comments, the C library getaddrinfo() function is resolved by the dynamic linker the first time an application calls it. I took valgrind out of the picture and ran "/usr/bin/hostname -d" under the control of the gdb debugger.

- With TaniumClient running, on entry to getaddrinfo() an int3 instruction is seen instead of the expected push %rbp instruction. This corrupts the call stack and raises a SIGTRAP. If run with valgrind, valgrind catches the raised SIGTRAP signal and exits with a core dump showing a corrupt call stack.

- With TaniumClient stopped, the expected push %rbp instruction is seen. If run with valgrind, the program runs correctly and completes normally.

Thanks to Paul Floyd and Mark Wielaard for checking the problem and for the debugging advice, which pointed me in the right direction for diagnosing it.

--
You are receiving this mail because:
You are watching all bug changes.
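The check described in Comment #8 (is the first instruction of getaddrinfo() an int3 or a push %rbp?) can be approximated without gdb. The following is a minimal sketch, not the reporter's method: it assumes Linux on x86-64, and the helper names classify_entry and probe are made up for illustration. It only inspects the copy of libc mapped into the Python process itself, so whether a third-party agent has patched that particular process is an open question. On newer glibc builds the function may also begin with a different instruction (for example endbr64), which the sketch reports as "other".

```python
import ctypes

INT3 = 0xCC      # x86 breakpoint opcode
PUSH_RBP = 0x55  # x86-64 encoding of push %rbp


def classify_entry(first_bytes: bytes) -> str:
    """Describe the first opcode byte of a function's machine code."""
    if not first_bytes:
        return "empty"
    opcode = first_bytes[0]
    if opcode == INT3:
        return "int3"
    if opcode == PUSH_RBP:
        return "push %rbp"
    return f"other (0x{opcode:02x})"


def probe(symbol: str = "getaddrinfo", count: int = 4) -> bytes:
    """Read the first `count` bytes of `symbol` as mapped in this process.

    Assumption: Linux, where CDLL(None) exposes the symbols already
    loaded into the running process.
    """
    libc = ctypes.CDLL(None)
    address = ctypes.cast(getattr(libc, symbol), ctypes.c_void_p).value
    return ctypes.string_at(address, count)


if __name__ == "__main__":
    entry = probe()
    print(entry.hex(), "->", classify_entry(entry))
```

Running the script with the suspect agent started and then stopped gives two readings to compare, in the same spirit as the two gdb runs.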
[valgrind] [Bug 466172] SIGTRAP crash whenever getaddrinfo call is issued by valgrind
https://bugs.kde.org/show_bug.cgi?id=466172

--- Comment #6 from Mike J ---
Thanks Paul. I was unaware of TUI mode, it's really useful.

The following extract is from the TUI asm and command windows. It shows an int3 rather than a push %rbp on the initial entry, where the call stack still shows as normal. After a stepi (rather than the step tried previously), the call stack has become corrupted.

Is the int3 likely to be something that valgrind might introduce instead of the push %rbp?

    B+> 0x534b5e0 <__GI_getaddrinfo>       int3
        0x534b5e1 <__GI_getaddrinfo+1>     mov    %rsp,%rbp
        0x534b5e4 <__GI_getaddrinfo+4>     push   %r15
        0x534b5e6 <__GI_getaddrinfo+6>     push   %r14
        0x534b5e8 <__GI_getaddrinfo+8>     mov    %rdi,%r14
        0x534b5eb <__GI_getaddrinfo+11>    push   %r13
        0x534b5ed <__GI_getaddrinfo+13>    mov    %rsi,%r13
        0x534b5f0 <__GI_getaddrinfo+16>    push   %r12
        0x534b5f2 <__GI_getaddrinfo+18>    mov    %rdx,%r12
        0x534b5f5 <__GI_getaddrinfo+21>    push   %rbx
        0x534b5f6 <__GI_getaddrinfo+22>    sub    $0x518,%rsp
        0x534b5fd <__GI_getaddrinfo+29>    test   %rdi,%rdi
        0x534b600 <__GI_getaddrinfo+32>    mov    %rcx,-0x530(%rbp)

    (gdb) where
    #0  __GI_getaddrinfo (name=0x5632040 "hostname.localdomain", service=service@entry=0x0,
        hints=hints@entry=0x1ffefff930, pai=pai@entry=0x1ffefff928)
        at ../sysdeps/posix/getaddrinfo.c:2208
    #1  0x00401b19 in show_name (type=type@entry=DNS) at hostname.c:339
    #2  0x004013e4 in main (argc=2, argv=0x1ffefffb98) at hostname.c:550

    (gdb) stepi

      > 0x534b5e1 <__GI_getaddrinfo+1>     mov    %rsp,%rbp
        0x534b5e4 <__GI_getaddrinfo+4>     push   %r15
        0x534b5e6 <__GI_getaddrinfo+6>     push   %r14
        0x534b5e8 <__GI_getaddrinfo+8>     mov    %rdi,%r14
        0x534b5eb <__GI_getaddrinfo+11>    push   %r13
        0x534b5ed <__GI_getaddrinfo+13>    mov    %rsi,%r13
        0x534b5f0 <__GI_getaddrinfo+16>    push   %r12
        0x534b5f2 <__GI_getaddrinfo+18>    mov    %rdx,%r12
        0x534b5f5 <__GI_getaddrinfo+21>    push   %rbx
        0x534b5f6 <__GI_getaddrinfo+22>    sub    $0x518,%rsp
        0x534b5fd <__GI_getaddrinfo+29>    test   %rdi,%rdi
        0x534b600 <__GI_getaddrinfo+32>    mov    %rcx,-0x530(%rbp)
        0x534b607 <__GI_getaddrinfo+39>    movq   $0x0,-0x4c0(%rbp)

    (gdb) where
    #0  0x0534b5e1 in __GI_getaddrinfo (name=0x5632040 "hostname.localdomain", service=0x0,
        hints=0x1ffefff930, pai=0x1ffefff928) at ../sysdeps/posix/getaddrinfo.c:2208
    #1  0x001ffefffa40 in ?? ()
    #2  0x0529d226 in __GI_getenv (name=0x1ffefff930 "\002") at getenv.c:35
    #3  0x in ?? ()
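One plausible reading of why frame #1 turns into a stack address after the stepi: once the program counter is at getaddrinfo+1, the unwind information takes it for granted that push %rbp has already moved %rsp down by 8 bytes. If an int3 ran in its place, %rsp never moved, so the debugger reads the "return address" from one slot too high. The sketch below is a deliberately simplified model of that off-by-one-slot effect, not gdb's actual unwinder. The function name and the stack contents are hypothetical, with values chosen only to resemble the trace above.

```python
def return_address_seen_by_unwinder(stack, entry_rsp, push_executed):
    """Model an unwinder stopped at getaddrinfo+1.

    At that point the unwind rules assume `push %rbp` already ran, so the
    return address is expected 8 bytes above the current %rsp.
    """
    # push %rbp lowers %rsp by 8; an int3 in its place leaves %rsp untouched.
    rsp = entry_rsp - 8 if push_executed else entry_rsp
    return stack[rsp + 8]


# Hypothetical stack contents (address -> 8-byte value), illustrative only.
ENTRY_RSP = 0x1000
STACK = {
    ENTRY_RSP: 0x401B19,          # return address into the caller
    ENTRY_RSP + 8: 0x1FFEFFFA40,  # a caller-owned slot that holds a stack pointer
}

if __name__ == "__main__":
    for executed in (True, False):
        seen = return_address_seen_by_unwinder(STACK, ENTRY_RSP, executed)
        print(f"push executed={executed}: frame #1 at {seen:#x}")
```

With the push executed the model recovers the caller's code address. Without it, the model returns the neighbouring stack slot, which is the kind of value a debugger would print as "?? ()".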
[valgrind] [Bug 466172] SIGTRAP crash whenever getaddrinfo call is issued by valgrind
https://bugs.kde.org/show_bug.cgi?id=466172

--- Comment #4 from Mike J ---
It was noted that we have Dynatrace OneAgent installed, which preloads one of its libraries by adding it to /etc/ld.so.preload. Although we originally thought it might be involved in the problem, we concluded today that it is not, after doing two valgrind runs of hostname -d with gdb attached, one with and one without the preloaded library.

The details of the two runs are shown below with gdb output, in the hope that somebody can spot something untoward that valgrind may be doing.

In the first run, /etc/ld.so.preload is set up to dynamically link in a Dynatrace OneAgent library for each program started.
In the second run, /etc/ld.so.preload is renamed and ldconfig is run to rebuild the runtime shared library cache.

GDB is attached once valgrind is started. Breakpoints are set on show_name and getaddrinfo, but execution is stepped through from the show_name breakpoint so that the dynamic linker's behaviour in loading the getaddrinfo call can also be watched.

The initial step from show_name() to the getaddrinfo() call shows the dynamic linker resolving the call from the glibc library. On first entry into the getaddrinfo function, the call stack is OK. On the next step instruction, the call stack becomes corrupted. Continuing leads to a SIGSEGV, rather than a SIGTRAP, which crashes the program.

Both runs are identical in outcome. If the debugger is not attached, a SIGTRAP is raised instead, which crashes the program.
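When repeating a with/without comparison like this one, it helps to record what the dynamic linker was told to preload for each run. The sketch below is a small illustration under stated assumptions: the helper names parse_preload and read_preload are made up, and the parsing follows the ld.so(8) description of /etc/ld.so.preload as a whitespace-separated list of shared objects. Any extra syntax a particular glibc version accepts is not handled.

```python
from pathlib import Path
from typing import List


def parse_preload(text: str) -> List[str]:
    """Split the contents of an ld.so.preload file into library entries."""
    # Assumption: entries are whitespace-separated, per ld.so(8).
    return text.split()


def read_preload(path: str = "/etc/ld.so.preload") -> List[str]:
    """Return the preload entries, or an empty list if the file is absent."""
    try:
        return parse_preload(Path(path).read_text())
    except FileNotFoundError:
        # Matches a run in which the file has been renamed away.
        return []


if __name__ == "__main__":
    entries = read_preload()
    print(entries if entries else "no preload entries")
```

An empty result before a run corresponds to the second configuration described above, where the preload file has been renamed.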
Lines with @ below are eye catchers for relevant notes

@ First run @

[auser@hostname ~]$ cat /etc/ld.so.preload
/$LIB/liboneagentproc.so
[auser@hostname ~]$ ldd /usr/bin/hostname
        linux-vdso.so.1 => (0x7ffd02d96000)
        /$LIB/liboneagentproc.so => /lib64/liboneagentproc.so (0x2af8ce652000)
        libnsl.so.1 => /usr/lib64/libnsl.so.1 (0x2af8ce86)
        libc.so.6 => /usr/lib64/libc.so.6 (0x2af8cea7a000)
        /lib64/ld-linux-x86-64.so.2 (0x2af8ce42e000)

Terminal 1
valgrind --trace-signals=yes -v --log-file=valgrind.out.2 --vgdb=full --vgdb-stop-at=startup hostname -d

Terminal 2
cat valgrind.out.2
==77535== Memcheck, a memory error detector
==77535== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==77535== Using Valgrind-3.20.0-5147d671e4-20221024 and LibVEX; rerun with -h for copyright info
==77535== Command: hostname -d
==77535== Parent PID: 111647
==77535==
--77535--
--77535-- Valgrind options:
--77535--    --trace-signals=yes
--77535--    -v
--77535--    --log-file=valgrind.out.2
--77535--    --vgdb=full
--77535--    --vgdb-stop-at=startup
--77535-- Contents of /proc/version:
--77535--   Linux version 3.10.0-1160.81.1.el7.x86_64 (mockbu...@x86-vm-38.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP Thu Nov 24 12:21:22 UTC 2022
--77535--
--77535-- Arch and hwcaps: AMD64, LittleEndian, amd64-cx16-lzcnt-rdtscp-sse3-ssse3-avx-avx2-bmi-f16c-rdrand
--77535-- Page sizes: currently 4096, max supported 4096
--77535-- Valgrind library directory: /home/auser/local/libexec/valgrind
--77535-- Reading syms from /usr/bin/hostname
--77535--   Considering /usr/lib/debug/.build-id/93/633698bd11eeb4bee21a388c191a5656990d8e.debug ..
--77535--   .. build-id is valid
--77535-- Reading syms from /usr/lib64/ld-2.17.so
--77535--   Considering /usr/lib/debug/.build-id/62/c449974331341bb08dcce3859560a22af1e172.debug ..
--77535--   .. build-id is valid
--77535-- Reading syms from /home/auser/local/libexec/valgrind/memcheck-amd64-linux
--77535--    object doesn't have a dynamic symbol table
--77535-- Scheduler: using generic scheduler lock implementation.
--77535-- Max kernel-supported signal is 64, VG_SIGVGKILL is 64
--77535-- Reading suppressions file: /home/auser/local/libexec/valgrind/default.supp
==77535== (action at startup) vgdb me ...
==77535== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-77535-by-auser-on-hostname.localdomain
==77535== embedded gdbserver: writing to /tmp/vgdb-pipe-to-vgdb-from-77535-by-auser-on-hostname.localdomain
==77535== embedded gdbserver: shared mem /tmp/vgdb-pipe-shared-mem-vgdb-77535-by-auser-on-hostname.localdomain
==77535==
==77535== TO CONTROL THIS PROCESS USING vgdb (which you probably
==77535== don't want to do, unless you know exactly what you're doing,
==77535== or are doing some strange experiment):
==77535==   /home/auser/local/libexec/valgrind/../../bin/vgdb --pid=77535 ...command...
==77535==
==77535== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==77535==   /path/to/gdb hostname
==77535== and then give GDB the following command
==77535==   target remote | /home/auser/local/libexec/valgrind/../../bin/vgdb --pid=77535
==77535== --pid is optional if only one valgrind process is running
==77535==

[auser@hostname ~]$ gdb /usr/bin/hostname
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is
[valgrind] [Bug 466172] SIGTRAP crash whenever getaddrinfo call is issued by valgrind
https://bugs.kde.org/show_bug.cgi?id=466172

--- Comment #3 from Mike J ---
Thanks. Although the sysadmins installed the correct debuginfo for glibc and hostname today, I won't have collatable results from this until 23rd Feb. I'll provide an update then.
[valgrind] [Bug 466172] New: SIGTRAP crash whenever getaddrinfo call is issued by valgrind
https://bugs.kde.org/show_bug.cgi?id=466172

            Bug ID: 466172
           Summary: SIGTRAP crash whenever getaddrinfo call is issued by valgrind
    Classification: Developer tools
           Product: valgrind
           Version: 3.20.0
          Platform: RedHat Enterprise Linux
                OS: Linux
            Status: REPORTED
          Severity: crash
          Priority: NOR
         Component: memcheck
          Assignee: jsew...@acm.org
          Reporter: do.not.spam.me.kde.bugzi...@gmail.com
  Target Milestone: ---

SUMMARY
On my work RHEL 7.9 development system, valgrind dumps core and exits following a SIGTRAP generated when calling the C library getaddrinfo function. This occurs when running valgrind on our internal applications that call getaddrinfo(), and also when running valgrind on the 'hostname -d' command. This produces a core file for the program. Output from valgrind is available below, as is gdb output from the core file. The problem occurs with both the memcheck tool and the callgrind tool.

The further details given below are from running valgrind with hostname -d. No failure occurs when running valgrind on hostname with no args, as that does not generate a getaddrinfo call.

The native valgrind v3.15 install for RHEL 7.9 and a locally compiled valgrind v3.20 release gave similar results.

The stack trace in the core file does not make a lot of sense, as the system doesn't currently have debuginfo packages installed. I'm trying to arrange these for the glibc and hostname packages via a sysadmin, but they are not available at the moment.

On my home host with VirtualBox, using RHEL 7.9, RHEL 8.7 and Fedora 35, running valgrind with signal tracing turned on against 'hostname -d' shows that two SIGSEGV signals are raised but handled by valgrind, and the command continues to successful completion. I can't comment on whether this should or shouldn't happen; I've just used it to establish an expected baseline. When run with strace instead of valgrind, no signals are found to be raised.
When running strace with valgrind and hostname -d, both strace and valgrind report the two SIGSEGV signals.

STEPS TO REPRODUCE
1. valgrind --trace-signals=yes -v hostname -d

OBSERVED RESULT
valgrind handles a single SIGSEGV and continues, then handles a SIGTRAP and core dumps. valgrind output indicates the crash happened during a getaddrinfo call.

EXPECTED RESULT
Program being run with valgrind should survive a call to getaddrinfo and continue running, so that memory leak checking can be carried out.

SOFTWARE/OS VERSIONS
Linux: RHEL 7.9 server without GUI installed

ADDITIONAL INFORMATION
Valgrind generates the following output

==80114== Memcheck, a memory error detector
==80114== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==80114== Using Valgrind-3.20.0-5147d671e4-20221024 and LibVEX; rerun with -h for copyright info
==80114== Command: hostname -d
==80114== Parent PID: 20454
==80114==
--80114--
--80114-- Valgrind options:
--80114--    --trace-signals=yes
--80114--    -v
--80114--    --log-file=valgrind.out
--80114-- Contents of /proc/version:
--80114--   Linux version 3.10.0-1160.81.1.el7.x86_64 (mockbu...@x86-vm-38.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP Thu Nov 24 12:21:22 UTC 2022
--80114--
--80114-- Arch and hwcaps: AMD64, LittleEndian, amd64-cx16-lzcnt-rdtscp-sse3-ssse3-avx-avx2-bmi-f16c-rdrand
--80114-- Page sizes: currently 4096, max supported 4096
--80114-- Valgrind library directory: /home/auser/local/libexec/valgrind
--80114-- Reading syms from /usr/bin/hostname
--80114--    object doesn't have a symbol table
--80114-- Reading syms from /usr/lib64/ld-2.17.so
--80114-- Reading syms from /home/auser/local/libexec/valgrind/memcheck-amd64-linux
--80114--    object doesn't have a dynamic symbol table
--80114-- Scheduler: using generic scheduler lock implementation.
--80114-- Max kernel-supported signal is 64, VG_SIGVGKILL is 64
--80114-- Reading suppressions file: /home/auser/local/libexec/valgrind/default.supp
==80114== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-80114-by-auser-on-hostname.localdomain
==80114== embedded gdbserver: writing to /tmp/vgdb-pipe-to-vgdb-from-80114-by-auser-on-hostname.localdomain
==80114== embedded gdbserver: shared mem /tmp/vgdb-pipe-shared-mem-vgdb-80114-by-auser-on-hostname.localdomain
==80114==
==80114== TO CONTROL THIS PROCESS USING vgdb (which you probably
==80114== don't want to do, unless you know exactly what you're doing,
==80114== or are doing some strange experiment):
==80114==   /home/auser/local/libexec/valgrind/../../bin/vgdb --pid=80114 ...command...
==80114==
==80114== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==80114==   /path/to/gdb hostname
==80114== and then give GDB the following command
==80114==   target remote | /home/auser/local/libexec/valgrind/../../bin/vgdb