Bug#963583: sysdig: sysdig segfaults on start
Control: reassign -1 libgrpc++1 1.26.0-3
Control: retitle -1 libgrpc++1 changes ABI without changing soname
Control: forwarded -1 https://github.com/grpc/grpc/issues/23205
Control: affects -1 src:sysdig

On Tue, Jun 23, 2020 at 06:59:25PM -0700, Dima Kogan wrote:
>...
> $ sudo sysdig ...
> sysdig: symbol lookup error: sysdig: undefined symbol:
> _ZN9grpc_impl23CreateCustomChannelImplERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt10shared_ptrINS_18ChannelCredentialsEERKNS_16ChannelArgumentsE
>
> This sounds like #955279, but even if it is, there should be a
> Conflicts, or something, to prevent me from getting into that state.
>...

The root problem is that libgrpc++ changes ABI without changing soname.

sysdig in unstable/testing is the only package in Debian that currently
uses libgrpc++, which explains why the problem is not reported more often.

Anyone compiling their own code against libgrpc++ in buster will run into
the same problem when upgrading to bullseye.

cu
Adrian
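To illustrate the failure mode described above: the dynamic linker reports
"undefined symbol" when a binary references a name that none of the loaded
libraries export any more. The following is a minimal Python sketch of that
check, assuming the default output format of `nm -D` from binutils. The
function names parse_nm_dynamic and unresolved, and the sample listings in
the demo, are hypothetical and are not real output from sysdig or libgrpc++.

```python
def parse_nm_dynamic(nm_output):
    """Split `nm -D` style output into (undefined, defined) symbol-name sets.

    Simplification: only the common global type letters are recognised.
    """
    undefined, defined = set(), set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Undefined entries have no address column: "U name".
        # Defined entries look like "ADDRESS TYPE name".
        sym_type, name = fields[-2], fields[-1]
        # Drop a symbol-version suffix such as "@GLIBC_2.2.5" if present.
        name = name.split("@", 1)[0]
        if sym_type == "U":
            undefined.add(name)
        elif sym_type in {"T", "W", "V", "D", "B", "R"}:
            defined.add(name)
    return undefined, defined


def unresolved(binary_nm, *library_nms):
    """Return symbols the binary needs that no listed library defines."""
    needed, _ = parse_nm_dynamic(binary_nm)
    provided = set()
    for lib_nm in library_nms:
        _, lib_defined = parse_nm_dynamic(lib_nm)
        provided |= lib_defined
    return needed - provided


if __name__ == "__main__":
    # Hypothetical listings, for illustration only.
    binary = "                 U needed_symbol\n                 U printf\n"
    old_lib = "0000000000001000 T needed_symbol\n"
    new_lib = "0000000000001000 T renamed_symbol\n"
    libc = "0000000000002000 T printf\n"
    print(sorted(unresolved(binary, old_lib, libc)))  # nothing missing
    print(sorted(unresolved(binary, new_lib, libc)))  # needed_symbol missing
```

In practice the listings would come from running `nm -D` on the binary and
on each library it links against. A library that keeps its soname but drops
or renames an exported symbol shows up as a non-empty result here, which is
the situation this bug describes.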
Bug#963583: sysdig: sysdig segfaults on start
Harlan Lieberman-Berg writes:

> Would it be possible for you to use rr to capture a dump of the
> segfault for me?

I tried before filing the report; rr doesn't work with sysdig, apparently.

> Hm, this is strange. I've tested this a couple of different ways,
> including on a clean sid box and a sid box with libc downgraded to the
> same version as on your machine, but I can't recreate the error.

Well, this isn't satisfying. I just ran an experiment, trying to see if
gcc-10 had any hand in this. The kernel module is compiled locally on the
user's box (gcc-10 here, presumably), but the userspace is built by the
Debian boxes (gcc-9, presumably). I just did this:

1. Check out the sysdig sources from upstream. Same tag as the version of
   the Debian package I have.

2. Build sysdig (userspace + kernel) from source with my default compiler
   (gcc-10). This takes forever because it insists on building all of its
   dependencies. There was a small build issue with the bundled libgrpc++
   that I fixed; it isn't interesting.

3. I installed the just-built kernel module, and ran the just-built sysdig
   tool. No segfault.

4. I installed the built-from-package kernel module, that was part of the
   crashing runs earlier, and ran the just-built sysdig tool. No segfault.

5. I installed the built-from-package kernel module, that was part of the
   crashing runs earlier, and ran the packaged sysdig tool. No segfault.

#5 is this bug report, but this thing works now. I can't break it anymore.
I haven't installed or removed any packages between now and when it was
broken, so I can't explain this.

I should also say that sysdig was unusable on this box for months, AND I
confirmed the segfault before filing this report, so this bug report wasn't
based on a one-off failure or anything. During those months I did
rebuild/rmmod/modprobe the kernel module several times, and that didn't fix
anything.

Let me dogfood this for a few days, and if I consistently can't reproduce
the failure I'll close this bug.

Sorry for the noise
Bug#963583: sysdig: sysdig segfaults on start
On Tue, 23 Jun 2020 18:59:25 -0700 Dima Kogan wrote:
> $ sudo sysdig ...
> sysdig: symbol lookup error: sysdig: undefined symbol:
> _ZN9grpc_impl23CreateCustomChannelImplERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt10shared_ptrINS_18ChannelCredentialsEERKNS_16ChannelArgumentsE
>
> This sounds like #955279, but even if it is, there should be a
> Conflicts, or something, to prevent me from getting into that state. In
> any case, I just

Updating the grpc library was probably the thing that fixed it; there was a
bad libgrpc version. AFAIK you shouldn't need the libgrpc++-dev package to
run sysdig.

> And now it segfaults again:
>
> $ sudo sysdig
> zsh: segmentation fault sudo sysdig

Hm, this is strange. I've tested this a couple of different ways,
including on a clean sid box and a sid box with libc downgraded to the
same version as on your machine, but I can't recreate the error.

Would it be possible for you to use rr to capture a dump of the
segfault for me?

Sincerely,
--
Harlan Lieberman-Berg
~hlieberman
Bug#963583: sysdig: sysdig segfaults on start
Package: sysdig
Version: 0.26.7-2
Severity: grave

Hi. sysdig used to work, but now it doesn't. I'm running Debian/sid, so
probably something about my set of dependencies doesn't agree with sysdig,
but we should figure out what that is.

Earlier today I was seeing a segfault when running some older sysdig
package I had installed. I just upgraded sysdig and sysdig-dkms to the
latest version available:

  $ dpkg -l 'sysdig*'
  ii  sysdig       0.26.7-2  amd64  ...
  ii  sysdig-dkms  0.26.7-2  all    ...

And then I got this:

  $ sudo sysdig ...
  sysdig: symbol lookup error: sysdig: undefined symbol: _ZN9grpc_impl23CreateCustomChannelImplERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt10shared_ptrINS_18ChannelCredentialsEERKNS_16ChannelArgumentsE

This sounds like #955279, but even if it is, there should be a Conflicts,
or something, to prevent me from getting into that state. In any case, I
just

  $ sudo apt install libgrpc++-dev

and also upgraded everything that sysdig explicitly depends on, except
libc6. And now it segfaults again:

  $ sudo sysdig
  zsh: segmentation fault sudo sysdig

The backtrace looks like this:

  #0 0x5575c7cdd210 in sinsp_parser::reset (this=0x5575c8474ac0, evt=0x5575c8454ce0) at ./userspace/libsinsp/parsers.cpp:717
  #1 0x5575c7ce3f3d in sinsp_parser::process_event (this=0x5575c8474ac0, evt=evt@entry=0x5575c8454ce0) at ./userspace/libsinsp/parsers.cpp:125
  #2 0x5575c7cfa8c9 in sinsp::next (this=0x5575c8454c50, puevt=0x7ffd28e5b0b8) at ./userspace/libsinsp/sinsp.cpp:1290
  #3 0x5575c7c016fc in do_inspect (inspector=0x5575c8454c50, cnt=18446744073709551615, duration_to_tot_ns=0, quiet=false, json=, do_flush=false, print_progress=false, display_filter=0x0, summary_table=std::vector of length 0, capacity 0, formatter=0x7ffd28e5b4e0) at ./userspace/sysdig/sysdig.cpp:604
  #4 0x5575c7c04877 in sysdig_init (argc=, argv=) at ./userspace/sysdig/sysdig.cpp:1596
  #5 0x5575c7bf1fcc in main (argc=, argv=0x7ffd28e5b788) at ./userspace/sysdig/sysdig.cpp:1694

This is inside the sysdig binary itself. No obvious cause. We're doing
this:

     0x5575c7cdd208 <+920>: callq  0x5575c7c2dd20
     0x5575c7cdd20d <+925>: mov    (%rax),%rax
  => 0x5575c7cdd210 <+928>: mov    (%rax),%rax
     0x5575c7cdd213 <+931>: test   %rax,%rax
     0x5575c7cdd216 <+934>: jns    0x5575c7cdd220

$rax isn't too crazy-looking, but we can't reference it:

  (gdb) p /x $rax
  $3 = 0x7f3406bcb6ba
  (gdb) x /32xb $rax
  0x7f3406bcb6ba: Cannot access memory at address 0x7f3406bcb6ba

I don't know if $rip is AT the offending instruction or the instruction
right after the offending instruction. So not entirely sure if rax is
parinfo or parinfo->m_val. Anyway...

Notes:

1. I upgraded everything sysdig Depends: on except libc6. Upgrading that
   would force me to upgrade my python, and that makes me touch stuff I'd
   rather not touch right now

2. I have gcc-10 installed, so that's where the libgcc... and libstdc++...
   are coming from

Is this enough info?

-- System Information:
Debian Release: buster/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'unstable'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: arm64, armhf

Kernel: Linux 4.17.0-1-amd64 (SMP w/20 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968) (ignored: LC_ALL set to C), LANGUAGE=C (charmap=ANSI_X3.4-1968) (ignored: LC_ALL set to C)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages sysdig depends on:
ii  libb64-0d        1.2-5+b1
ii  libc6            2.30-2
ii  libcurl4         7.68.0-1
ii  libelf1          0.176-1.1
ii  libgcc-s1        10.1.0-4
ii  libgrpc++1       1.26.0-3
ii  libjq1           1.6-1
ii  libjsoncpp1      1.7.4-3.1
ii  libluajit-5.1-2  2.1.0~beta3+dfsg-5.1
ii  libncurses6      6.2-1
ii  libprotobuf22    3.11.4-5
ii  libssl1.1        1.1.1g-1
ii  libstdc++6       10.1.0-4
ii  libtbb2          2020.2-2
ii  libtinfo6        6.2-1
ii  zlib1g           1:1.2.11.dfsg-2

Versions of packages sysdig recommends:
ii  sysdig-dkms  0.26.7-2

sysdig suggests no packages.

-- no debconf information
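
Regarding the gdb observation in the report above that $rax looks plausible
but cannot be dereferenced: one way to test such an address is to compare it
against the process's memory mappings. The following is a minimal Python
sketch, assuming input in the /proc/<pid>/maps format. The helper names
parse_maps and find_region and the sample mapping lines are hypothetical and
are not taken from the crashing process.

```python
def parse_maps(maps_text):
    """Parse /proc/<pid>/maps style text into (start, end, perms) tuples."""
    regions = []
    for line in maps_text.splitlines():
        fields = line.split()
        if not fields or "-" not in fields[0]:
            continue
        # First column is "START-END" in hexadecimal; END is exclusive.
        start_hex, end_hex = fields[0].split("-", 1)
        perms = fields[1] if len(fields) > 1 else ""
        regions.append((int(start_hex, 16), int(end_hex, 16), perms))
    return regions


def find_region(addr, regions):
    """Return the mapping that contains addr, or None if it is unmapped."""
    for start, end, perms in regions:
        if start <= addr < end:
            return (start, end, perms)
    return None


if __name__ == "__main__":
    # Hypothetical mappings, for illustration only.
    sample = (
        "55d0c0000000-55d0c0001000 r-xp 00000000 08:01 1234 /usr/bin/example\n"
        "7f3400000000-7f3400021000 rw-p 00000000 00:00 0\n"
    )
    regions = parse_maps(sample)
    # Address taken from the gdb session quoted in the report.
    print(find_region(0x7F3406BCB6BA, regions))
```

An address that falls in no mapping cannot be read, however plausible its
numeric value looks, which would be consistent with the "Cannot access
memory" message from gdb.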