Thanks Dave, will check it out. -Rajith
On Tue, Nov 17, 2020 at 8:40 PM <[email protected]> wrote:

> Let’s be clear: you’re seeing a crash in a *modified fork* of vpp-19.08.
> I’ve never seen such a crash myself, nor has one been reported by
> anyone else, to my knowledge.
>
> That having been written, all signs point to the volatile int ** vector
> vl_api_queue_cursizes having had an accident:
>
>   static void
>   memclnt_queue_callback (vlib_main_t * vm)
>   {
>     <snip>
>     for (i = 0; i < vec_len (vl_api_queue_cursizes); i++)
>       {
>         if (*vl_api_queue_cursizes[i])
>           {
>             vm->queue_signal_pending = 1;
>             vm->api_queue_nonempty = 1;
>             vlib_process_signal_event (vm, vl_api_clnt_node.index,
>                                        /* event_type */ QUEUE_SIGNAL_EVENT,
>                                        /* event_data */ 0);
>             break;
>           }
>       }
>     <snip>
>   }
>
> Try a debug image. Try capturing “i” and the value of
> vl_api_queue_cursizes[i] before dereferencing it as a pointer. Add a couple of
> global variables with names which won’t collide with anything else:
>
>   int oingo_save_i;
>   volatile int *oingo_save_cursizep;
>
> In the loop, set:
>
>   oingo_save_i = i;
>   oingo_save_cursizep = vl_api_queue_cursizes[i];
>
>   if (*vl_api_queue_cursizes[i])
>   <etc>
>
> Capture a coredump. It should be obvious why the reference blows up. If
> you can, change your custom signal handler so that the faulting virtual
> address is as obvious as possible.
>
> Beyond that, you’re on your own.
>
> HTH... Dave
>
> *From:* [email protected] <[email protected]> *On Behalf Of* Rajith PR via lists.fd.io
> *Sent:* Tuesday, November 17, 2020 7:03 AM
> *To:* vpp-dev <[email protected]>
> *Subject:* [vpp-dev]: Crash in memclnt_queue_callback().
>
> Hi All,
>
> We are seeing a random crash in *VPP-19.08*. The crash occurs in
> memclnt_queue_callback, in code that we are not using. Any pointers to fix
> the crash would be helpful.
>
> *Complete Call Stack:*
>
> Thread 1 (Thread 0x7fe728f43d00 (LWP 189)):
> #0  0x00007fe728049492 in __GI___waitpid (pid=732, stat_loc=stat_loc@entry=0x7fe6f9ebeed8, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:30
> #1  0x00007fe727fb4177 in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:149
> #2  0x00007fe728ad6457 in bd_signal_handler_cb (signo=11) at /development/librtbrickinfra/bd/src/bd.c:770
> #3  0x00007fe71c90fbf7 in rtb_bd_signal_handler (signo=11) at /development/libvpp/src/vlib/unix/main.c:80
> #4  0x00007fe71c90ff92 in unix_signal_handler (signum=11, si=0x7fe6f9ebf7b0, uc=0x7fe6f9ebf680) at /development/libvpp/src/vlib/unix/main.c:180
> #5  <signal handler called>
> #6  memclnt_queue_callback (vm=0x7fe71cb49e80 <vlib_global_main>) at /development/libvpp/src/vlibmemory/memory_api.c:96
> #7  0x00007fe71c8a9258 in vlib_main_or_worker_loop (vm=0x7fe71cb49e80 <vlib_global_main>, is_main=1) at /development/libvpp/src/vlib/main.c:1799
> #8  0x00007fe71c8a9f9d in vlib_main_loop (vm=0x7fe71cb49e80 <vlib_global_main>) at /development/libvpp/src/vlib/main.c:1982
> #9  0x00007fe71c8aac7b in vlib_main (vm=0x7fe71cb49e80 <vlib_global_main>, input=0x7fe6f9ebffb0) at /development/libvpp/src/vlib/main.c:2209
> #10 0x00007fe71c911745 in thread0 (arg=140630595772032) at /development/libvpp/src/vlib/unix/main.c:666
> #11 0x00007fe71c568560 in clib_calljmp () from /usr/local/lib/libvppinfra.so.1.0.1
> #12 0x00007ffe85672480 in ?? ()
> #13 0x00007fe71c911cbb in vlib_unix_main (argc=42, argv=0x563be4aaa5a0) at /development/libvpp/src/vlib/unix/main.c:736
> #14 0x00007fe71e0bc9eb in rtb_vpp_core_init (argc=42, argv=0x563be4aaa5a0) at /development/libvpp/src/vpp/vnet/main.c:483
> #15 0x00007fe71e18fba2 in rtb_vpp_main () at /development/libvpp/src/vpp/rtbrick/rtb_vpp_main.c:113
> #16 0x00007fe728ad5e46 in bd_load_daemon_lib (dmn_lib_cfg=0x7fe728cf2820 <bd_json_global+21408>) at /development/librtbrickinfra/bd/src/bd.c:627
> #17 0x00007fe728ad5ef1 in bd_load_all_daemon_libs () at /development/librtbrickinfra/bd/src/bd.c:646
> #18 0x00007fe728ad7362 in bd_start_process () at /development/librtbrickinfra/bd/src/bd.c:1128
> #19 0x00007fe72583c860 in bds_bd_init () at /development/librtbrickinfra/libbds/code/bds/src/bds.c:657
> #20 0x00007fe7258c8a30 in pubsub_bd_init_expiry (data=0x0) at /development/librtbrickinfra/libbds/code/pubsub/src/pubsub_helper.c:1444
> #21 0x00007fe7285d6640 in timer_dispatch (item=0x563be68209b0, p=QB_LOOP_HIGH) at /development/librtbrickinfra/libqb/lib/loop_timerlist.c:56
> #22 0x00007fe7285d25d6 in qb_loop_run_level (level=0x563be47a17a0) at /development/librtbrickinfra/libqb/lib/loop.c:43
> #23 0x00007fe7285d2d4b in qb_loop_run (lp=0x563be47a1730) at /development/librtbrickinfra/libqb/lib/loop.c:210
> #24 0x00007fe7285e461e in lib_qb_service_start_event_loop () at /development/librtbrickinfra/libqb/lib/wrapper/lib_qb_service.c:257
> #25 0x0000563be3d9f153 in main ()
>
> *Code Snippet:*
>
>  94   for (i = 0; i < vec_len (vl_api_queue_cursizes); i++)
>  95     {
>  96       if (*vl_api_queue_cursizes[i])    <--- Crashed here
>  97         {
>  98           vm->queue_signal_pending = 1;
>  99           vm->api_queue_nonempty = 1;
> 100           vlib_process_signal_event (vm, vl_api_clnt_node.index,
> 101                                      /* event_type */ QUEUE_SIGNAL_EVENT,
> 102                                      /* event_data */ 0);
> 103           break;
> 104         }
> 105     }
>
> Thanks,
>
> Rajith
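
For reference, the two debugging steps suggested in the quoted reply (save the index and pointer before the dereference, and make the faulting address obvious in the signal handler) might look roughly like the sketch below. This is a standalone illustration, not VPP code. scan_cursizes, format_hex and install_segv_reporter are hypothetical names; the loop takes a plain array and length instead of the real vec_len() vector; and the pointer type of oingo_save_cursizep is inferred from the "volatile int **" description of vl_api_queue_cursizes. In the real fork the existing unix_signal_handler would presumably be adapted rather than replaced.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Debug globals, named as in the suggestion above. They are written just
   before the dereference, so a coredump shows the last values tried. */
int oingo_save_i;
volatile int *oingo_save_cursizep;

/* Stand-in (assumption) for the scan loop in memclnt_queue_callback.
   Returns the index of the first non-empty queue, or -1 if none. */
int
scan_cursizes (volatile int **cursizes, size_t n)
{
  size_t i;

  assert (cursizes != NULL || n == 0);
  for (i = 0; i < n; i++)
    {
      oingo_save_i = (int) i;
      oingo_save_cursizep = cursizes[i];
      if (*cursizes[i])
        return (int) i;
    }
  return -1;
}

/* Write "0x" plus 16 hex digits into buf (at least 19 bytes) and return
   the string length. Uses no libc formatting, so it is safe in a handler. */
size_t
format_hex (unsigned long long addr, char *buf)
{
  static const char digits[] = "0123456789abcdef";
  size_t pos = 0;
  int shift;

  buf[pos++] = '0';
  buf[pos++] = 'x';
  for (shift = 60; shift >= 0; shift -= 4)
    buf[pos++] = digits[(addr >> shift) & 0xf];
  buf[pos] = '\0';
  return pos;
}

/* Print the faulting virtual address, then return. SA_RESETHAND has already
   restored the default action, so the faulting instruction re-executes and
   the kernel produces the coredump at the original fault site. */
static void
segv_handler (int signo, siginfo_t * si, void *uc)
{
  static const char msg[] = "SIGSEGV, faulting address ";
  char buf[19];
  size_t len;
  ssize_t rc;

  (void) signo;
  (void) uc;
  len = format_hex ((unsigned long long) (uintptr_t) si->si_addr, buf);
  rc = write (STDERR_FILENO, msg, sizeof (msg) - 1);
  rc = write (STDERR_FILENO, buf, len);
  rc = write (STDERR_FILENO, "\n", 1);
  (void) rc;
}

/* Install the reporter for SIGSEGV. Returns 0 on success, -1 on error. */
int
install_segv_reporter (void)
{
  struct sigaction sa;

  memset (&sa, 0, sizeof (sa));
  sa.sa_sigaction = segv_handler;
  sa.sa_flags = SA_SIGINFO | SA_RESETHAND;
  sigemptyset (&sa.sa_mask);
  return sigaction (SIGSEGV, &sa, NULL);
}
```

With something like this in place, "print oingo_save_i" and "print oingo_save_cursizep" in gdb on the coredump show which vector slot was being read, and the address printed by the handler can be compared against the saved pointer to confirm that the slot itself holds a bad value.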
