Re: [vpp-dev] vpp crash in vlib_node_sync_stats
Hi,

Can you try to reproduce the issue in debug mode, or even better in debug mode with ASan enabled: https://fd.io/docs/vpp/master/troubleshooting/sanitizer.html#id2 ?
This looks like memory corruption, but it is hard to tell anything else.

Best
Ben

> -----Original Message-----
> From: vpp-dev@lists.fd.io On Behalf Of ??
> Sent: Saturday, 18 September 2021 11:08
> To: vpp-dev@lists.fd.io
> Subject: [vpp-dev] vpp crash in vlib_node_sync_stats
>
> Has anyone seen this stack before? I'm using af_xdp to receive packets for
> veth; the veth has two queues.
> vpp version: 2106
>
> Thread 1 "vpp_main" received signal SIGSEGV, Segmentation fault.
> vlib_node_sync_stats (vm=vm@entry=0x7fffb5e00680, n=0x7fffb671bc70) at
> /mnt/opensource/vpp/src/vlib/main.c:630
> 630 vlib_node_runtime_sync_stats (vm, rt, 0, 0, 0);
> (gdb) bt
> #0 vlib_node_sync_stats (vm=vm@entry=0x7fffb5e00680, n=0x7fffb671bc70) at
> /mnt/opensource/vpp/src/vlib/main.c:630
> #1 0x763ee865 in vlib_node_get_nodes (vm=vm@entry=0x0,
> max_threads=<optimized out>, max_threads@entry=4294967295,
> include_stats=include_stats@entry=1,
> barrier_sync=barrier_sync@entry=0,
> node_dupsp=node_dupsp@entry=0x7fffb0a06f68,
> stat_vmsp=stat_vmsp@entry=0x7fffb0a06f80) at
> /mnt/opensource/vpp/src/vlib/node.c:647
> #2 0x55564d69 in update_node_counters (sm=<optimized out>) at
> /mnt/opensource/vpp/src/vpp/stats/stat_segment.c:621
> #3 do_stat_segment_updates (vm=0x7fffb5e00680, sm=<optimized out>) at
> /mnt/opensource/vpp/src/vpp/stats/stat_segment.c:778
> #4 stat_segment_collector_process (vm=0x7fffb5e00680, rt=<optimized out>,
> f=<optimized out>) at /mnt/opensource/vpp/src/vpp/stats/stat_segment.c:873
> #5 0x763e6437 in vlib_process_bootstrap (_a=<optimized out>) at
> /mnt/opensource/vpp/src/vlib/main.c:1284
> #6 0x762fcee4 in clib_calljmp () at
> /mnt/opensource/vpp/src/vppinfra/longjmp.S:123
> #7 0x7fffb2d09d70 in ?? ()
> #8 0x763dd1ee in vlib_process_startup (vm=0x7fffb0a08000,
> p=0x20737365636f7270, f=0x0) at /mnt/opensource/vpp/src/vlib/main.c:1309
> #9 dispatch_process (vm=<optimized out>, p=<optimized out>, f=0x0,
> last_time_stamp=<optimized out>) at
> /mnt/opensource/vpp/src/vlib/main.c:1365
> #10 0x in ?? ()
>
> (gdb) p *n
> $4 = {function = 0x794b9700794154,
> name = 0x793af800794a00 <error: Cannot access memory at address 0x793af800794a00>,
> name_elog_string = 7918067,
> stats_total = {calls = 3402357128522,
> vectors = 3402357128522, clocks = 34135068636670782, suspends =
> 34048202923117826, max_clock = 34035648733712284, max_clock_n =
> 34035648733700162}, stats_last_clear = {
> calls = 34048202923106370, vectors = 33998621820653468, clocks =
> 34039303750868577, suspends = 34025001509779113, max_clock =
> 34023571285685848, max_clock_n = 34142378671018106},
> type = 7952796, index = 7933852, runtime_index = 7924542, runtime_data =
> 0x78eb3e0078f6a9, flags = 57410, state = 120 'x', runtime_data_bytes = 0
> '\000', protocol_hint = 169 '\251',
> n_errors = 120, scalar_size = 60222, vector_size = 120,
> error_heap_handle = 7927465, error_heap_index = 7924542, error_counters =
> 0x78eb3e00793658, next_node_names = 0x78f6a90078e042,
> next_nodes = 0x78c99100790f9c,
> sibling_of = 0x794d2e00793e26 <error: Cannot access memory at address 0x794d2e00793e26>,
> sibling_bitmap = 0x78e0420078eb3e,
> n_vectors_by_next_node = 0x790f9c0078f6a9, next_slot_by_node =
> 0x78f6a90078c991, prev_node_bitmap = 0x78f6a90078eb3e, owner_node_index =
> 7943768, owner_next_index = 7921730,
> format_buffer = 0x78d2870078eb3e, unformat_buffer = 0x78e0420078f6a9,
> format_trace = 0x78eb3e00790f9c, validate_frame = 0x790078c991,
> state_string = 0x78cb0300795952 <error: Cannot access memory at address 0x78cb0300795952>,
> node_fn_registrations = 0x78cb4d0078cb28}

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#20210): https://lists.fd.io/g/vpp-dev/message/20210
Mute This Topic: https://lists.fd.io/mt/85695514/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-
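One way to look at the corrupted node in the `p *n` output of this message is to split each pointer-sized field into its 32-bit halves. The sketch below does that for four values copied from the dump; `split_u32` is a hypothetical helper name, not part of VPP or gdb. The halves all fall in the 0x78xxxx to 0x79xxxx range and repeat across fields, and some equal the 32-bit fields printed in the same dump (`index`, `runtime_index`). That would be consistent with the node structure having been overwritten by an array of 32-bit values, but it is only a reading of the dump, not a confirmed diagnosis.

```python
# Pointer-sized fields copied from the "p *n" output in the message.
fields = {
    "runtime_data": 0x78eb3e0078f6a9,
    "next_node_names": 0x78f6a90078e042,
    "next_nodes": 0x78c99100790f9c,
    "sibling_bitmap": 0x78e0420078eb3e,
}

# 32-bit fields from the same dump, kept in decimal as gdb printed them.
u32_fields = {"index": 7933852, "runtime_index": 7924542}


def split_u32(value):
    """Return the (high, low) 32-bit halves of a 64-bit value."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


halves = {name: split_u32(value) for name, value in fields.items()}

for name, (hi, lo) in halves.items():
    print(f"{name:16s} hi=0x{hi:08x} lo=0x{lo:08x}")

for name, value in u32_fields.items():
    print(f"{name:16s} 0x{value:08x}")
```

Running the same split under ASan, as suggested in the reply, would show where the overwrite actually comes from; the script only makes the pattern in the dump easier to see.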
Re: [vpp-dev] #vpp #vnet os_panic for failed barrier timeout
This is the key bit from thread 1:

#3 0x7f000c0ab051 in unix_signal_handler (signum=6, si=<optimized out>, uc=<optimized out>) at /usr/src/debug/vpp-21.01/src/vlib/unix/main.c:187
#4 0x7f000a8995f0 in __funlockfile (stream=0x10af) at ../nptl/sysdeps/pthread/funlockfile.c:28

The thread barrier timeout is a side-effect. Please try to work out what’s causing “__funlockfile(...)” to generate SIGABRT...

D.

From: vpp-dev@lists.fd.io On Behalf Of satishse...@gmail.com
Sent: Monday, September 27, 2021 8:47 AM
To: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] #vpp #vnet os_panic for failed barrier timeout

Hi Dave,

We are seeing the same issue in vpp 21.01.

warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
Core was generated by `/usr/bin/vpp -c /opt/ani/etc//startup_nrdufr2.conf'.
Program terminated with signal 6, Aborted.
#0 0x7f0009916337 in __bsd_signal (sig=4271, handler=0x10af) at ../sysdeps/posix/signal.c:50
50 }
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-46.el7.x86_64 libcom_err-1.42.9-17.el7.x86_64 libgcc-4.8.5-44.el7.x86_64 libselinux-2.5-15.el7.x86_64 libuuid-2.23.2-65.el7_9.1.x86_64 numactl-libs-2.0.12-5.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) thread apply bt full
(gdb) thread apply all bt

Thread 3 (LWP 4622):
#0 0x7f00099dee63 in capset () at ../sysdeps/unix/syscall-template.S:83
#1 0x in ?? ()

Thread 2 (LWP 4623):
#0 vlib_worker_thread_barrier_check () at /usr/src/debug/vpp-21.01/src/vlib/threads.h:440
#1 vlib_main_or_worker_loop (is_main=0, vm=0x7effcc15db00) at /usr/src/debug/vpp-21.01/src/vlib/main.c:1812
#2 vlib_worker_loop (vm=0x7effcc15db00) at /usr/src/debug/vpp-21.01/src/vlib/main.c:2038
#3 0x7f000bf9eda0 in clib_calljmp () from /usr/lib64/libvppinfra.so.21.01
#4 0x7effc1174c00 in ?? ()
#5 0x7effc53b947d in eal_thread_loop.cold () from /usr/lib/vpp_plugins/dpdk_plugin.so
#6 0x in ?? ()

Thread 1 (LWP 4271):
#0 0x7f0009916337 in __bsd_signal (sig=4271, handler=0x10af) at ../sysdeps/posix/signal.c:50
#1 0x7f0009917a28 in __GI_abort () at abort.c:79
#2 0x55f56215a5ca in os_exit () at /usr/src/debug/vpp-21.01/src/vpp/vnet/main.c:433
#3 0x7f000c0ab051 in unix_signal_handler (signum=6, si=<optimized out>, uc=<optimized out>) at /usr/src/debug/vpp-21.01/src/vlib/unix/main.c:187
#4 0x7f000a8995f0 in __funlockfile (stream=0x10af) at ../nptl/sysdeps/pthread/funlockfile.c:28
#5 0x0001 in ?? ()
#6 0x in ?? ()
(gdb)

--
Regards,
Satish Singh

-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#20209): https://lists.fd.io/g/vpp-dev/message/20209
-=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] #vpp #vnet os_panic for failed barrier timeout
Hi Dave,

We are seeing the same issue in vpp 21.01.

warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
Core was generated by `/usr/bin/vpp -c /opt/ani/etc//startup_nrdufr2.conf'.
Program terminated with signal 6, Aborted.
#0 0x7f0009916337 in __bsd_signal (sig=4271, handler=0x10af) at ../sysdeps/posix/signal.c:50
50 }
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-46.el7.x86_64 libcom_err-1.42.9-17.el7.x86_64 libgcc-4.8.5-44.el7.x86_64 libselinux-2.5-15.el7.x86_64 libuuid-2.23.2-65.el7_9.1.x86_64 numactl-libs-2.0.12-5.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) thread apply bt full
(gdb) thread apply all bt

Thread 3 (LWP 4622):
#0 0x7f00099dee63 in capset () at ../sysdeps/unix/syscall-template.S:83
#1 0x in ?? ()

Thread 2 (LWP 4623):
#0 vlib_worker_thread_barrier_check () at /usr/src/debug/vpp-21.01/src/vlib/threads.h:440
#1 vlib_main_or_worker_loop (is_main=0, vm=0x7effcc15db00) at /usr/src/debug/vpp-21.01/src/vlib/main.c:1812
#2 vlib_worker_loop (vm=0x7effcc15db00) at /usr/src/debug/vpp-21.01/src/vlib/main.c:2038
#3 0x7f000bf9eda0 in clib_calljmp () from /usr/lib64/libvppinfra.so.21.01
#4 0x7effc1174c00 in ?? ()
#5 0x7effc53b947d in eal_thread_loop.cold () from /usr/lib/vpp_plugins/dpdk_plugin.so
#6 0x in ?? ()

Thread 1 (LWP 4271):
#0 0x7f0009916337 in __bsd_signal (sig=4271, handler=0x10af) at ../sysdeps/posix/signal.c:50
#1 0x7f0009917a28 in __GI_abort () at abort.c:79
#2 0x55f56215a5ca in os_exit () at /usr/src/debug/vpp-21.01/src/vpp/vnet/main.c:433
#3 0x7f000c0ab051 in unix_signal_handler (signum=6, si=<optimized out>, uc=<optimized out>) at /usr/src/debug/vpp-21.01/src/vlib/unix/main.c:187
#4 0x7f000a8995f0 in __funlockfile (stream=0x10af) at ../nptl/sysdeps/pthread/funlockfile.c:28
#5 0x0001 in ?? ()
#6 0x in ?? ()
(gdb)

--
Regards,
Satish Singh

-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#20208): https://lists.fd.io/g/vpp-dev/message/20208
-=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] Upcoming API trace improvements
Filip and I have worked on cleaning up the API trace infrastructure.

https://gerrit.fd.io/r/c/vpp/+/32652
api: API trace improvements

Type: improvement

* add support for JSON format in API trace
* add ability to replay JSON API trace in both VPP and VAT2
* use CRC for backward compatibility check during JSON API replay
* fix API trace CLI (and remove duplicates)
* remove custom dump
* remove vppapitrace.py
* update docs accordingly

https://gerrit.fd.io/r/c/vpp/+/33819/1
api: binary-api-json command to call api from vpp cli

The binary-api-json command uses the auto-generated JSON to/from API binary message converters to allow calling any API from the VPP command line.

Examples:

binary-api-json sw_interface_add_del_address {"sw_if_index": 1, "is_add": true, "del_all": false, "prefix": "10.0.0.1/24"}
binary-api-json sw_interface_dump {"sw_if_index": 4294967295, "name_filter_valid": false, "name_filter": ""}

DBGvpp# binary-api-json show_version {}
{
    "_msgname": "show_version_reply",
    "_crc": "c919bde1",
    "retval": 0,
    "program": "vpe",
    "version": "22.02-rc0~3-g87ee3f86f",
    "build_date": "2021-09-24T20:23:34",
    "build_directory": "/vpp/master"
}

With these changes there should no longer be a need to manually write API code for tracing, replay or VAT. VAT2 and the new binary-api-json command in VPP use only auto-generated code, so unless compatibility with VAT is desired, no code needs to be written.

Both changes are currently in review; expect them to go in within the next few days.

Cheers,
Ole

-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#20207): https://lists.fd.io/g/vpp-dev/message/20207
-=-=-=-=-=-=-=-=-=-=-=-
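The binary-api-json examples in this message pair an API message name with a JSON object on one line. As a minimal sketch of how such command lines could be generated from a script, the helper below serializes a Python dict with the standard `json` module. The function name `binary_api_json_cmd` is hypothetical and not part of VPP; only the command name, message names and field names are taken from the examples in the message, and how the resulting string is delivered to the VPP CLI is left out.

```python
import json


def binary_api_json_cmd(msg_name, fields):
    """Format one binary-api-json CLI line from a message name and a dict.

    Hypothetical helper for illustration; it only builds the string.
    """
    return f"binary-api-json {msg_name} {json.dumps(fields)}"


# Message and field names copied from the examples in the announcement.
commands = [
    binary_api_json_cmd("show_version", {}),
    binary_api_json_cmd(
        "sw_interface_dump",
        {"sw_if_index": 4294967295, "name_filter_valid": False, "name_filter": ""},
    ),
    binary_api_json_cmd(
        "sw_interface_add_del_address",
        {"sw_if_index": 1, "is_add": True, "del_all": False, "prefix": "10.0.0.1/24"},
    ),
]

for line in commands:
    print(line)
```

Building the JSON with `json.dumps` rather than by hand keeps booleans lowercase (`true`/`false`) and strings quoted, which matches the form used in the examples.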