+dev lists

Peter Mikus
Engineer - Software
Cisco Systems Limited
> -----Original Message-----
> From: Peter Mikus -X (pmikus - PANTHEON TECH SRO at Cisco)
> Sent: Friday, November 29, 2019 11:06 AM
> To: Benoit Ganne (bganne) <bga...@cisco.com>; Juraj Linkeš
> <juraj.lin...@pantheon.tech>; Maciek Konstantynowicz (mkonstan)
> <mkons...@cisco.com>
> Cc: Vratko Polak -X (vrpolak - PANTHEON TECH SRO at Cisco)
> <vrpo...@cisco.com>; Benoit Ganne (bganne) <bga...@cisco.com>;
> lijian.zh...@arm.com; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> Subject: CSIT - performance tests failing on Taishan
>
> Hello all,
>
> In CSIT we are observing the issue with Taishan boxes where performance
> tests are failing.
> There has been long misleading discussion about the potential issue, root
> cause and what workaround to apply.
>
> Issue
> =====
> VPP is being restarted after an attempt to read "show pci" over the
> socket on '/run/vpp/cli.sock'
> in a loop. This loop test is executed in CSIT towards VPP with default
> startup configuration via command below to check if VPP is really UP and
> responding.
>
> How to reproduce
> ================
> for i in $(seq 1 120); do echo "show pci" | sudo socat - UNIX-CONNECT:/run/vpp/cli.sock; sudo netstat -ap | grep vpp; done
>
> The same can be reproduced using vppctl:
>
> for i in $(seq 1 120); do echo "show pci" | sudo vppctl; sudo netstat -ap | grep vpp; done
>
> To eliminate the issue with test itself I used "show version"
> for i in $(seq 1 120); do echo "show version" | sudo socat - UNIX-CONNECT:/run/vpp/cli.sock; sudo netstat -ap | grep vpp; done
>
> This test is passing with "show version" and VPP is not restarted.
>
>
> Root cause
> ==========
> The root cause seems to be:
>
> Thread 1 "vpp_main" received signal SIGSEGV, Segmentation fault.
> 0x0000ffffbeb4f3d0 in format_vlib_pci_vpd (
>     s=0xffff7fabe830 "0002:f9:00.0 0 15b3:1015 8.0 GT/s x8 mlx5_core CX4121A - ConnectX-4 LX SFP28", args=<optimized out>)
>     at /w/workspace/vpp-arm-merge-master-ubuntu1804/src/vlib/pci/pci.c:230
> 230 /w/workspace/vpp-arm-merge-master-ubuntu1804/src/vlib/pci/pci.c: No such file or directory.
> (gdb)
> Continuing.
>
> Thread 1 "vpp_main" received signal SIGABRT, Aborted.
> __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> 51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb)
>
>
> Issue started after MLX was installed into Taishan.
>
>
> @Benoit Ganne (bganne) can you please help fixing the root cause?
>
> Thank you.
>
> Peter Mikus
> Engineer - Software
> Cisco Systems Limited
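
The reproduction loop quoted above runs all 120 iterations and leaves it to the reader to spot a restart in the netstat output. A minimal sketch of a variant that stops at the first restart by comparing the VPP PID before and after each CLI command is shown here. The function names (get_pid, send_cli, find_restart) and the VPP_PROC / VPP_SOCK variables are hypothetical and not part of CSIT. The process name "vpp_main" is an assumption taken from the thread name in the gdb output.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names below are illustrative, not CSIT or VPP API.

# Assumption: the VPP main process is visible to pgrep as "vpp_main"
# (the thread name shown by gdb). Override VPP_PROC if it differs.
get_pid() {
    pgrep -o -x "${VPP_PROC:-vpp_main}"
}

# Send one CLI command over the CLI socket, as in the original loop.
send_cli() {
    echo "$1" | sudo socat - "UNIX-CONNECT:${VPP_SOCK:-/run/vpp/cli.sock}"
}

# find_restart CMD [COUNT]
# Sends CMD up to COUNT times (default 120) and returns 1 as soon as
# the PID reported by get_pid differs from the one seen at the start.
find_restart() {
    local cmd="$1" count="${2:-120}" before after i
    before="$(get_pid)"
    for i in $(seq 1 "$count"); do
        send_cli "$cmd" >/dev/null 2>&1
        after="$(get_pid)"
        if [ "$after" != "$before" ]; then
            echo "restart detected at iteration $i (pid $before -> ${after:-none})"
            return 1
        fi
    done
    echo "no restart in $count iterations"
    return 0
}
```

After sourcing the file, running find_restart "show pci" and then find_restart "show version" would give a direct comparison of the two cases described in the report. The PID check can race with a slow restart, so a short sleep after send_cli may be needed in practice.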