Bug#963583: sysdig: sysdig segfaults on start

2020-07-06 Thread Adrian Bunk
Control: reassign -1 libgrpc++1 1.26.0-3
Control: retitle -1 libgrpc++1 changes ABI without changing soname
Control: forwarded -1 https://github.com/grpc/grpc/issues/23205
Control: affects -1 src:sysdig

On Tue, Jun 23, 2020 at 06:59:25PM -0700, Dima Kogan wrote:
>...
>   $ sudo sysdig ...
>   sysdig: symbol lookup error: sysdig: undefined symbol: _ZN9grpc_impl23CreateCustomChannelImplERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt10shared_ptrINS_18ChannelCredentialsEERKNS_16ChannelArgumentsE
> 
> This sounds like #955279, but even if it is, there should be a
> Conflicts, or something, to prevent me from getting into that state.
>...

The root problem is that libgrpc++ changes ABI without changing soname.

sysdig in unstable/testing is the only package in Debian that currently 
uses libgrpc++, which explains why the problem is not reported more often.

Anyone compiling their own code against libgrpc++ in buster will run into
the same problem when upgrading to bullseye.
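
To make the failure mode concrete, here is a minimal sketch of the check the
dynamic linker effectively performs: every symbol the binary needs has to be
exported by whichever library file answers to the soname. The two export sets
and the `unresolved` helper are hypothetical stand-ins for illustration, not
real libgrpc++ symbol lists; only the needed symbol is copied from the report.

```python
# Symbol the sysdig binary failed to resolve, copied from the report.
NEEDED = {
    "_ZN9grpc_impl23CreateCustomChannelImplERKNSt7__cxx1112basic_string"
    "IcSt11char_traitsIcESaIcEEERKSt10shared_ptrINS_18ChannelCredentialsEE"
    "RKNS_16ChannelArgumentsE",
}

# Hypothetical export lists for two library builds that share one soname.
BUILD_A_EXPORTS = NEEDED | {"hypothetical_common_symbol"}
BUILD_B_EXPORTS = {"hypothetical_common_symbol"}


def unresolved(needed, exported):
    """Return the needed symbols that the library does not export, sorted."""
    return sorted(set(needed) - set(exported))


if __name__ == "__main__":
    for label, exports in (("build A", BUILD_A_EXPORTS),
                           ("build B", BUILD_B_EXPORTS)):
        missing = unresolved(NEEDED, exports)
        status = "ok" if not missing else f"missing {len(missing)} symbol(s)"
        print(label, status)
```

Since both builds carry the same soname, a plain dependency on the library
package cannot tell them apart; a soname bump or a versioned dependency could.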

cu
Adrian



Bug#963583: sysdig: sysdig segfaults on start

2020-07-02 Thread Dima Kogan
Harlan Lieberman-Berg writes:

> Would it be possible for you to use rr to capture a dump of the
> segfault for me?

I tried before filing the report; rr doesn't work with sysdig,
apparently.


> Hm, this is strange. I've tested this a couple of different ways,
> including on a clean sid box and a sid box with libc downgraded to the
> same version as on your machine, but I can't recreate the error.

Well, this isn't satisfying. I just ran an experiment, trying to see if
gcc-10 had any hand in this. The kernel module is compiled locally on
the user's box (gcc-10 here, presumably), but the userspace is built by
the Debian boxes (gcc-9) presumably. I just did this:

1. Check out the sysdig sources from upstream. Same tag as the version
   of the Debian package I have.

2. Build sysdig (userspace + kernel) from source with my default
   compiler (gcc-10). This takes forever because it insists on building
   all of its dependencies. There was a small build issue with the
   bundled libgrpc++ that I fixed; it isn't interesting.

3. I installed the just-built kernel module, and ran the just-built
   sysdig tool. No segfault.

4. I installed the built-from-package kernel module that was part of
   the crashing runs earlier, and ran the just-built sysdig tool. No
   segfault.

5. I installed the built-from-package kernel module that was part of
   the crashing runs earlier, and ran the packaged sysdig tool. No
   segfault.

#5 is this bug report, but this thing works now. I can't break it
anymore. I haven't installed or removed any packages between now and
when it was broken, so I can't explain this. I should also say that
sysdig was unusable on this box for months, AND I confirmed the segfault
before filing this report, so this bug report wasn't based on a one-off
failure or anything. During those months I did rebuild/rmmod/modprobe
the kernel module several times, and that didn't fix anything.

Let me dogfood this for a few days, and if I consistently can't
reproduce the failure I'll close this bug.

Sorry for the noise



Bug#963583: sysdig: sysdig segfaults on start

2020-07-02 Thread Harlan Lieberman-Berg
On Tue, 23 Jun 2020 18:59:25 -0700 Dima Kogan wrote:
>   $ sudo sysdig ...
>   sysdig: symbol lookup error: sysdig: undefined symbol: _ZN9grpc_impl23CreateCustomChannelImplERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt10shared_ptrINS_18ChannelCredentialsEERKNS_16ChannelArgumentsE
>
> This sounds like #955279, but even if it is, there should be a
> Conflicts, or something, to prevent me from getting into that state. In
> any case, I just

Updating the grpc library was probably the thing that fixed it; there
was a bad libgrpc version.  AFAIK you shouldn't need the libgrpc++-dev
package to run sysdig.

> And now it segfaults again:
>
>   $ sudo sysdig
>   zsh: segmentation fault  sudo sysdig

Hm, this is strange.  I've tested this a couple of different ways,
including on a clean sid box and a sid box with libc downgraded to the
same version as on your machine, but I can't recreate the error.

Would it be possible for you to use rr to capture a dump of the segfault for me?

Sincerely,

-- 
Harlan Lieberman-Berg
~hlieberman



Bug#963583: sysdig: sysdig segfaults on start

2020-06-24 Thread Dima Kogan
Package: sysdig
Version: 0.26.7-2
Severity: grave

Hi. sysdig used to work, but now it doesn't. I'm running Debian/sid, so
probably something about my set of dependencies doesn't agree with
sysdig, but we should figure out what that is.

Earlier today I was seeing a segfault when running some older sysdig
package I had installed. I just upgraded sysdig and sysdig-dkms to the
latest version available:

  $ dpkg -l 'sysdig*'

  ii  sysdig       0.26.7-2  amd64  ...
  ii  sysdig-dkms  0.26.7-2  all    ...

And then I got this:

  $ sudo sysdig ...
  sysdig: symbol lookup error: sysdig: undefined symbol: _ZN9grpc_impl23CreateCustomChannelImplERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt10shared_ptrINS_18ChannelCredentialsEERKNS_16ChannelArgumentsE
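
For anyone who doesn't read mangled names, a rough sketch that pulls the
qualified function name out of that symbol. It only handles the simple
`_ZN<len><name>...E` nested-name form of the Itanium C++ ABI mangling and
ignores the argument types entirely; `qualified_name` is a helper made up for
this illustration, and c++filt is the proper tool for the job.

```python
SYMBOL = (
    "_ZN9grpc_impl23CreateCustomChannelImplERKNSt7__cxx1112basic_string"
    "IcSt11char_traitsIcESaIcEEERKSt10shared_ptrINS_18ChannelCredentialsEE"
    "RKNS_16ChannelArgumentsE"
)


def qualified_name(mangled):
    """Extract 'ns::func' from a simple _ZN<len><id>...E mangled name."""
    if not mangled.startswith("_ZN"):
        raise ValueError("not a nested-name symbol")
    pos = 3
    parts = []
    # Each component is a decimal length followed by that many characters.
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    # The nested name must be closed by 'E'; anything else is out of scope.
    if not parts or pos >= len(mangled) or mangled[pos] != "E":
        raise ValueError("unsupported mangled form")
    return "::".join(parts)


if __name__ == "__main__":
    print(qualified_name(SYMBOL))
```

For the symbol above this yields grpc_impl::CreateCustomChannelImpl, i.e. the
missing piece is a function in the grpc_impl namespace.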

This sounds like #955279, but even if it is, there should be a
Conflicts, or something, to prevent me from getting into that state. In
any case, I just

  $ sudo apt install libgrpc++-dev

and also upgraded everything that sysdig explicitly depends on, except libc6.

And now it segfaults again:

  $ sudo sysdig
  zsh: segmentation fault  sudo sysdig

The backtrace looks like this:

  #0  0x5575c7cdd210 in sinsp_parser::reset (this=0x5575c8474ac0, evt=0x5575c8454ce0) at ./userspace/libsinsp/parsers.cpp:717
  #1  0x5575c7ce3f3d in sinsp_parser::process_event (this=0x5575c8474ac0, evt=evt@entry=0x5575c8454ce0) at ./userspace/libsinsp/parsers.cpp:125
  #2  0x5575c7cfa8c9 in sinsp::next (this=0x5575c8454c50, puevt=0x7ffd28e5b0b8) at ./userspace/libsinsp/sinsp.cpp:1290
  #3  0x5575c7c016fc in do_inspect (inspector=0x5575c8454c50, cnt=18446744073709551615, duration_to_tot_ns=0, quiet=false, json=<optimized out>, do_flush=false, print_progress=false, display_filter=0x0, summary_table=std::vector of length 0, capacity 0, formatter=0x7ffd28e5b4e0) at ./userspace/sysdig/sysdig.cpp:604
  #4  0x5575c7c04877 in sysdig_init (argc=<optimized out>, argv=<optimized out>) at ./userspace/sysdig/sysdig.cpp:1596
  #5  0x5575c7bf1fcc in main (argc=<optimized out>, argv=0x7ffd28e5b788) at ./userspace/sysdig/sysdig.cpp:1694

This is inside the sysdig binary itself. No obvious cause. We're doing
this:

     0x5575c7cdd208 <+920>:   callq  0x5575c7c2dd20
     0x5575c7cdd20d <+925>:   mov    (%rax),%rax
  => 0x5575c7cdd210 <+928>:   mov    (%rax),%rax
     0x5575c7cdd213 <+931>:   test   %rax,%rax
     0x5575c7cdd216 <+934>:   jns    0x5575c7cdd220

$rax isn't too crazy-looking, but we can't reference it:

  (gdb) p /x $rax
  $3 = 0x7f3406bcb6ba

  (gdb) x /32xb $rax
  0x7f3406bcb6ba: Cannot access memory at address 0x7f3406bcb6ba

I don't know if $rip is AT the offending instruction or the instruction
right after the offending instruction. So not entirely sure if rax is
parinfo or parinfo->m_val. Anyway...
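
On the $rip question, a hedged sketch: on x86-64 a memory-access fault is
reported with the instruction pointer still at the faulting instruction, so
the `=>` line is most likely the load that failed, and $rax is then whatever
the previous `mov (%rax),%rax` loaded. The toy model below replays the two
loads against a fake memory map. `CALL_RESULT`, the map contents, and the
helper names are hypothetical; only the bad pointer value is from the gdb
session.

```python
BAD_POINTER = 0x7F3406BCB6BA  # $rax at the time of the fault, from gdb
CALL_RESULT = 0x1000          # hypothetical value returned by the callq

# Hypothetical memory map: only CALL_RESULT is mapped, and it holds the
# pointer that later turns out to be unreadable.
MEMORY = {CALL_RESULT: BAD_POINTER}


class Fault(Exception):
    """Raised when a load touches an unmapped address."""


def load(memory, address):
    """Return the word stored at address, or raise Fault if it is unmapped."""
    if address not in memory:
        raise Fault(address)
    return memory[address]


def replay(memory, rax):
    """Run both `mov (%rax),%rax` loads; return (faulting offset, rax then)."""
    for offset in (925, 928):
        try:
            rax = load(memory, rax)
        except Fault:
            return offset, rax
    return None, rax


if __name__ == "__main__":
    offset, rax = replay(MEMORY, CALL_RESULT)
    print(f"fault at +{offset} with rax={rax:#x}")
```

Under that reading, $rax is the value loaded at +925 rather than the pointer
returned by the call; which C++ expression that maps to is not something the
sketch can tell.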

Notes:

1. I upgraded everything sysdig Depends: on except libc6. Upgrading that
   would force me to upgrade my python, and that makes me touch stuff
   I'd rather not touch right now

2. I have gcc-10 installed, so that's where the libgcc... and
   libstdc++... are coming from

Is this enough info?

-- System Information:
Debian Release: buster/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'unstable'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: arm64, armhf

Kernel: Linux 4.17.0-1-amd64 (SMP w/20 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968) (ignored: LC_ALL set to C), LANGUAGE=C (charmap=ANSI_X3.4-1968) (ignored: LC_ALL set to C)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages sysdig depends on:
ii  libb64-0d        1.2-5+b1
ii  libc6            2.30-2
ii  libcurl4         7.68.0-1
ii  libelf1          0.176-1.1
ii  libgcc-s1        10.1.0-4
ii  libgrpc++1       1.26.0-3
ii  libjq1           1.6-1
ii  libjsoncpp1      1.7.4-3.1
ii  libluajit-5.1-2  2.1.0~beta3+dfsg-5.1
ii  libncurses6      6.2-1
ii  libprotobuf22    3.11.4-5
ii  libssl1.1        1.1.1g-1
ii  libstdc++6       10.1.0-4
ii  libtbb2          2020.2-2
ii  libtinfo6        6.2-1
ii  zlib1g           1:1.2.11.dfsg-2

Versions of packages sysdig recommends:
ii  sysdig-dkms  0.26.7-2

sysdig suggests no packages.

-- no debconf information