Re: [vpp-dev] vpp crash in vlib_node_sync_stats

2021-09-27 Thread Benoit Ganne (bganne) via lists.fd.io
Hi,

Can you try to reproduce the issue in debug mode, or even better in debug mode 
with ASan enabled: 
https://fd.io/docs/vpp/master/troubleshooting/sanitizer.html#id2 ?

This looks like memory corruption, but it is hard to say more from this trace alone.

Best
Ben

> -----Original Message-----
> From: vpp-dev@lists.fd.io  On Behalf Of ??
> Sent: Saturday, 18 September 2021 11:08
> To: vpp-dev@lists.fd.io
> Subject: [vpp-dev] vpp crash in vlib_node_sync_stats
> 
> Has anyone seen this stack before? I'm using af_xdp to receive packets on a
> veth interface; the veth has two queues.
> vpp version: 2106
> 
> Thread 1 "vpp_main" received signal SIGSEGV, Segmentation fault.
> vlib_node_sync_stats (vm=vm@entry=0x7fffb5e00680, n=0x7fffb671bc70) at
> /mnt/opensource/vpp/src/vlib/main.c:630
> 630   vlib_node_runtime_sync_stats (vm, rt, 0, 0, 0);
> (gdb) bt
> #0  vlib_node_sync_stats (vm=vm@entry=0x7fffb5e00680, n=0x7fffb671bc70) at
> /mnt/opensource/vpp/src/vlib/main.c:630
> #1  0x763ee865 in vlib_node_get_nodes (vm=vm@entry=0x0,
> max_threads=, max_threads@entry=4294967295,
> include_stats=include_stats@entry=1,
> barrier_sync=barrier_sync@entry=0,
> node_dupsp=node_dupsp@entry=0x7fffb0a06f68,
> stat_vmsp=stat_vmsp@entry=0x7fffb0a06f80) at
> /mnt/opensource/vpp/src/vlib/node.c:647
> #2  0x55564d69 in update_node_counters (sm=) at
> /mnt/opensource/vpp/src/vpp/stats/stat_segment.c:621
> #3  do_stat_segment_updates (vm=0x7fffb5e00680, sm=) at
> /mnt/opensource/vpp/src/vpp/stats/stat_segment.c:778
> #4  stat_segment_collector_process (vm=0x7fffb5e00680, rt=,
> f=) at /mnt/opensource/vpp/src/vpp/stats/stat_segment.c:873
> #5  0x763e6437 in vlib_process_bootstrap (_a=) at
> /mnt/opensource/vpp/src/vlib/main.c:1284
> #6  0x762fcee4 in clib_calljmp () at
> /mnt/opensource/vpp/src/vppinfra/longjmp.S:123
> #7  0x7fffb2d09d70 in ?? ()
> #8  0x763dd1ee in vlib_process_startup (vm=0x7fffb0a08000,
> p=0x20737365636f7270, f=0x0) at /mnt/opensource/vpp/src/vlib/main.c:1309
> #9  dispatch_process (vm=, p=, f=0x0,
> last_time_stamp=) at
> /mnt/opensource/vpp/src/vlib/main.c:1365
> #10 0x in ?? ()
> 
> (gdb) p *n
> $4 = {function = 0x794b9700794154, name = 0x793af800794a00 <error: Cannot access memory at address 0x793af800794a00>, name_elog_string = 7918067,
> stats_total = {calls = 3402357128522,
> vectors = 3402357128522, clocks = 34135068636670782, suspends =
> 34048202923117826, max_clock = 34035648733712284, max_clock_n =
> 34035648733700162}, stats_last_clear = {
> calls = 34048202923106370, vectors = 33998621820653468, clocks =
> 34039303750868577, suspends = 34025001509779113, max_clock =
> 34023571285685848, max_clock_n = 34142378671018106},
>   type = 7952796, index = 7933852, runtime_index = 7924542, runtime_data =
> 0x78eb3e0078f6a9, flags = 57410, state = 120 'x', runtime_data_bytes = 0
> '\000', protocol_hint = 169 '\251',
>   n_errors = 120, scalar_size = 60222, vector_size = 120,
> error_heap_handle = 7927465, error_heap_index = 7924542, error_counters =
> 0x78eb3e00793658, next_node_names = 0x78f6a90078e042,
>   next_nodes = 0x78c99100790f9c, sibling_of = 0x794d2e00793e26 <error: Cannot access memory at address 0x794d2e00793e26>, sibling_bitmap =
> 0x78e0420078eb3e,
>   n_vectors_by_next_node = 0x790f9c0078f6a9, next_slot_by_node =
> 0x78f6a90078c991, prev_node_bitmap = 0x78f6a90078eb3e, owner_node_index =
> 7943768, owner_next_index = 7921730,
>   format_buffer = 0x78d2870078eb3e, unformat_buffer = 0x78e0420078f6a9,
> format_trace = 0x78eb3e00790f9c, validate_frame = 0x790078c991,
>   state_string = 0x78cb0300795952 <error: Cannot access memory at address 0x78cb0300795952>, node_fn_registrations = 0x78cb4d0078cb28}
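
One way to read the `p *n` dump above, offered as an observation rather than a diagnosis: the pointer-typed fields do not resemble the user-space addresses seen elsewhere in the backtrace (for example `vm=0x7fffb5e00680`). Split into 32-bit halves, each one looks like a pair of small integers of similar magnitude, which would be consistent with the node structure having been overwritten by an array of 32-bit values. The short Python sketch below does that split for a few values copied from the dump; the helper names and the 2^24 threshold are illustrative assumptions.

```python
# Pointer-typed fields copied verbatim from the `p *n` output above.
SUSPECT_FIELDS = {
    "function": 0x794B9700794154,
    "runtime_data": 0x78EB3E0078F6A9,
    "error_counters": 0x78EB3E00793658,
    "next_node_names": 0x78F6A90078E042,
    "next_nodes": 0x78C99100790F9C,
}

# Illustrative threshold (an assumption): both halves fit in 24 bits.
SMALL_LIMIT = 1 << 24


def split_words(value):
    """Return the (high, low) 32-bit halves of a 64-bit value."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


def looks_like_packed_u32(value, limit=SMALL_LIMIT):
    """True if both 32-bit halves are small non-zero integers."""
    high, low = split_words(value)
    return 0 < high < limit and 0 < low < limit


if __name__ == "__main__":
    for name, value in SUSPECT_FIELDS.items():
        high, low = split_words(value)
        print(f"{name:16s} high=0x{high:08x} low=0x{low:08x} "
              f"packed={looks_like_packed_u32(value)}")
    # A plausible address from the same backtrace, for contrast.
    print("vm", looks_like_packed_u32(0x7FFFB5E00680))
```

If the same pattern shows up in a debug or ASan run, the 32-bit values themselves may hint at which table or vector is being written out of bounds.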

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#20210): https://lists.fd.io/g/vpp-dev/message/20210
Mute This Topic: https://lists.fd.io/mt/85695514/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] #vpp #vnet os_panic for failed barrier timeout

2021-09-27 Thread Dave Barach
This is the key bit from thread 1:

#3  0x7f000c0ab051 in unix_signal_handler (signum=6, si=, 
uc=) at /usr/src/debug/vpp-21.01/src/vlib/unix/main.c:187
#4  0x7f000a8995f0 in __funlockfile (stream=0x10af) at 
../nptl/sysdeps/pthread/funlockfile.c:28

The thread barrier timeout is a side effect. Please try to work out what’s 
causing “__funlockfile(...)” to generate SIGABRT...

D. 
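
A small sanity check on the frames quoted above, as a sketch rather than a conclusion: on Linux, signal 6 is SIGABRT, and the hexadecimal argument `0x10af` shown in the `__funlockfile` frame equals 4271, which is the LWP id of thread 1 in the backtrace posted in this thread. That coincidence may mean the frame is mis-symbolized (gdb reported missing debuginfo packages) rather than a real call into that function; that reading is an assumption. The helper names below are hypothetical.

```python
import signal

THREAD1_LWP = 4271      # "Thread 1 (LWP 4271)" in the posted backtrace
FRAME_ARG = "0x10af"    # value shown as stream= in the __funlockfile frame
HANDLER_SIGNUM = 6      # signum= in the unix_signal_handler frame


def signal_name(signum):
    """Map a signal number to its symbolic name on the current platform."""
    return signal.Signals(signum).name


def arg_matches_lwp(hex_arg, lwp):
    """True if a hexadecimal frame argument equals the given LWP id."""
    return int(hex_arg, 16) == lwp


if __name__ == "__main__":
    print(signal_name(HANDLER_SIGNUM))
    print(arg_matches_lwp(FRAME_ARG, THREAD1_LWP))
```

Installing the debuginfo packages listed by gdb and re-reading the core would show whether frame #4 really is `__funlockfile`.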

 

 

From: vpp-dev@lists.fd.io  On Behalf Of satishse...@gmail.com
Sent: Monday, September 27, 2021 8:47 AM
To: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] #vpp #vnet os_panic for failed barrier timeout





Re: [vpp-dev] #vpp #vnet os_panic for failed barrier timeout

2021-09-27 Thread satishsept7
Hi Dave,

We are seeing the same issue in VPP 21.01:

warning: Unable to find libthread_db matching inferior's thread library, thread 
debugging will not be available.
Core was generated by `/usr/bin/vpp -c /opt/ani/etc//startup_nrdufr2.conf'.
Program terminated with signal 6, Aborted.
#0  0x7f0009916337 in __bsd_signal (sig=4271, handler=0x10af) at 
../sysdeps/posix/signal.c:50
50 }
Missing separate debuginfos, use: debuginfo-install 
keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-46.el7.x86_64 
libcom_err-1.42.9-17.el7.x86_64 libgcc-4.8.5-44.el7.x86_64 
libselinux-2.5-15.el7.x86_64 libuuid-2.23.2-65.el7_9.1.x86_64 
numactl-libs-2.0.12-5.el7.x86_64 pcre-8.32-17.el7.x86_64 
zlib-1.2.7-18.el7.x86_64
(gdb) thread apply bt full
(gdb) thread apply all bt

Thread 3 (LWP 4622):
#0  0x7f00099dee63 in capset () at ../sysdeps/unix/syscall-template.S:83
#1  0x in ?? ()

Thread 2 (LWP 4623):
#0  vlib_worker_thread_barrier_check () at 
/usr/src/debug/vpp-21.01/src/vlib/threads.h:440
#1  vlib_main_or_worker_loop (is_main=0, vm=0x7effcc15db00) at 
/usr/src/debug/vpp-21.01/src/vlib/main.c:1812
#2  vlib_worker_loop (vm=0x7effcc15db00) at 
/usr/src/debug/vpp-21.01/src/vlib/main.c:2038
#3  0x7f000bf9eda0 in clib_calljmp () from /usr/lib64/libvppinfra.so.21.01
#4  0x7effc1174c00 in ?? ()
#5  0x7effc53b947d in eal_thread_loop.cold () from 
/usr/lib/vpp_plugins/dpdk_plugin.so
#6  0x in ?? ()

Thread 1 (LWP 4271):
#0  0x7f0009916337 in __bsd_signal (sig=4271, handler=0x10af) at 
../sysdeps/posix/signal.c:50
#1  0x7f0009917a28 in __GI_abort () at abort.c:79
#2  0x55f56215a5ca in os_exit () at 
/usr/src/debug/vpp-21.01/src/vpp/vnet/main.c:433
#3  0x7f000c0ab051 in unix_signal_handler (signum=6, si=, 
uc=) at /usr/src/debug/vpp-21.01/src/vlib/unix/main.c:187
#4  0x7f000a8995f0 in __funlockfile (stream=0x10af) at 
../nptl/sysdeps/pthread/funlockfile.c:28
#5  0x0001 in ?? ()
#6  0x in ?? ()
(gdb)
--
Regards,
Satish Singh




[vpp-dev] Upcoming API trace improvements

2021-09-27 Thread Ole Troan
Filip and I have worked on cleaning up the API trace infrastructure.

https://gerrit.fd.io/r/c/vpp/+/32652

api: API trace improvements

Type: improvement

 * add support for JSON format in API trace
 * add ability to replay JSON API trace in both VPP and VAT2
 * use CRC for backward compatibility check during JSON API replay
 * fix API trace CLI (and remove duplicates)
 * remove custom dump
 * remove vppapitrace.py
 * update docs accordingly
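
To illustrate the CRC-based compatibility check listed above, here is a minimal Python sketch of the idea only; it is not the code from the Gerrit change. It assumes each traced message carries `_msgname` and `_crc` fields (as in the `show_version` output further down in this message) and that the running build exposes a name-to-CRC table; `CURRENT_CRCS`, `can_replay` and `filter_replayable` are hypothetical names.

```python
import json

# Hypothetical table of message CRCs for the running build. The single entry
# uses the value printed in the show_version example in this message.
CURRENT_CRCS = {"show_version_reply": "c919bde1"}


def can_replay(entry, current_crcs=CURRENT_CRCS):
    """Accept a traced message only if its CRC matches the current API."""
    expected = current_crcs.get(entry.get("_msgname"))
    return expected is not None and expected == entry.get("_crc")


def filter_replayable(trace_json, current_crcs=CURRENT_CRCS):
    """Split a JSON list of traced messages into (replayable, rejected)."""
    replayable, rejected = [], []
    for entry in json.loads(trace_json):
        if can_replay(entry, current_crcs):
            replayable.append(entry)
        else:
            rejected.append(entry)
    return replayable, rejected


if __name__ == "__main__":
    trace = json.dumps([
        {"_msgname": "show_version_reply", "_crc": "c919bde1"},
        {"_msgname": "show_version_reply", "_crc": "00000000"},
    ])
    ok, bad = filter_replayable(trace)
    print(len(ok), "replayable,", len(bad), "rejected")
```

Rejecting on a CRC mismatch, instead of guessing at a field mapping, keeps a stale trace from being replayed against a message definition that has since changed.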


https://gerrit.fd.io/r/c/vpp/+/33819/1

api: binary-api-json command to call api from vpp cli

The binary-api-json command uses the auto-generated JSON to/from API binary
message converters to allow calling any API from the VPP command line.

Examples:
binary-api-json sw_interface_add_del_address {"sw_if_index": 1, "is_add": true, "del_all": false, "prefix": "10.0.0.1/24"}
binary-api-json sw_interface_dump {"sw_if_index": 4294967295, "name_filter_valid": false, "name_filter": ""}

DBGvpp# binary-api-json show_version {}
{
"_msgname": "show_version_reply",
"_crc": "c919bde1",
"retval":   0,
"program":  "vpe",
"version":  "22.02-rc0~3-g87ee3f86f",
"build_date":   "2021-09-24T20:23:34",
"build_directory":  "/vpp/master"
}
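
Hand-writing the JSON argument is error-prone, so a script driving the CLI might generate the command line instead. The Python sketch below builds strings of the form shown in the examples above; `build_cli_command` is a hypothetical helper, and how the string is delivered to VPP is left out. Note that 4294967295 in the `sw_interface_dump` example is 0xFFFFFFFF, the all-ones 32-bit value; reading it as a wildcard meaning "all interfaces" is an assumption here.

```python
import json

# 4294967295, the value used for sw_if_index in the sw_interface_dump example.
U32_ALL_ONES = 0xFFFFFFFF


def build_cli_command(api_name, fields):
    """Return a binary-api-json command line for the given API and fields."""
    return f"binary-api-json {api_name} {json.dumps(fields)}"


if __name__ == "__main__":
    print(build_cli_command("show_version", {}))
    print(build_cli_command(
        "sw_interface_dump",
        {"sw_if_index": U32_ALL_ONES,
         "name_filter_valid": False,
         "name_filter": ""},
    ))
```

Using `json.dumps` takes care of quoting and of lowercase `true`/`false`, which are easy to get wrong when typing the argument by hand.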



With these changes there should no longer be a need to manually write API code 
for tracing, replay or VAT. VAT2 and the new binary-api-json command in VPP use 
only auto-generated code, so unless compatibility with VAT is desired, no 
code needs to be written.

Both changes are currently in review; expect them to go in within the next few days.

Cheers,
Ole


