Hi Chetan,

In our case we see this issue only occasionally, and the exact steps to reproduce it are not known. I have changed our process node as Dave suggested and am now trying to reproduce the issue with those changes.
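For readers following the thread, the loop shape Dave suggested can be reduced to a standalone sketch. Everything named here (`wakeup_t`, `loop_stats_t`, `handle_wakeup`, `run_loop`, `EVENT_TIMEOUT`) is a hypothetical stand-in, not VPP API and not the actual change made to rtb-vpp-epoll-process. It only models the control flow: treat a timer expiration as the normal case, flag any other event as unexpected, and always reset the event data afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names are VPP API. */
#define EVENT_TIMEOUT UINTPTR_MAX /* plays the role of the ~0 "timer expired" type */

typedef struct {
  uintptr_t type; /* event type reported for this wakeup */
  size_t n_data;  /* number of event data words delivered with it */
} wakeup_t;

typedef struct {
  unsigned timer_runs;        /* wakeups caused by the 10 ms timer */
  unsigned unexpected_events; /* wakeups caused by an event nobody should send */
  size_t pending_data;        /* event data still held after the last wakeup */
} loop_stats_t;

/* One loop iteration: collect event data, dispatch on type, then reset. */
static void
handle_wakeup (loop_stats_t *s, wakeup_t w)
{
  assert (s != NULL);
  s->pending_data += w.n_data; /* models get_events appending to event_data */

  if (w.type == EVENT_TIMEOUT)
    s->timer_runs++; /* where rtb_event_loop_run_once () would run */
  else
    s->unexpected_events++; /* the version in this thread uses ASSERT (0) here */

  s->pending_data = 0; /* models vec_reset_length (event_data) */
}

/* Replay a recorded sequence of wakeups through the loop body. */
static loop_stats_t
run_loop (const wakeup_t *wakeups, size_t n)
{
  loop_stats_t s = { 0, 0, 0 };
  for (size_t i = 0; i < n; i++)
    handle_wakeup (&s, wakeups[i]);
  return s;
}
```

Replaying three timer expirations with one stray event in between leaves `timer_runs` at 3 and `unexpected_events` at 1. In this model, a nonzero `unexpected_events` count is the hint that some other code is signalling the process.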
I will post my results and findings to this thread soon.

Thanks and Regards,
Sudhir

On Fri, Mar 3, 2023 at 12:37 PM chetan bhasin <chetan.bhasin...@gmail.com> wrote:

> Hi Sudhir,
>
> Is your issue resolved?
>
> Actually, we are facing the same issue on vpp.2106.
> In our case "api-rx-ring" is not getting called.
> In our use case, workers call some functions in main-thread context,
> which leads to RPC messages, and memory is allocated from the API section.
> As a result, the API segment memory is fully used, which leads to a crash.
>
> Thanks,
> Chetan
>
>
> On Mon, Feb 20, 2023, 18:24 Sudhir CR via lists.fd.io <sudhir=
> rtbrick....@lists.fd.io> wrote:
>
>> Hi Dave,
>> Thank you very much for your inputs. I will try this out and get back to
>> you with the results.
>>
>> Regards,
>> Sudhir
>>
>> On Mon, Feb 20, 2023 at 6:01 PM Dave Barach <v...@barachs.net> wrote:
>>
>>> Please try something like this, to eliminate the possibility that some
>>> bit of code is sending this process an event. It’s not a good idea to skip
>>> the vec_reset_length (event_data) step.
>>>
>>> while (1)
>>> {
>>>   uword event_type, * event_data = 0;
>>>   int i;
>>>
>>>   vlib_process_wait_for_event_or_clock (vm, 1e-2 /* 10 ms */);
>>>
>>>   event_type = vlib_process_get_events (vm, &event_data);
>>>
>>>   switch (event_type) {
>>>   case ~0: /* handle timer expirations */
>>>     rtb_event_loop_run_once ();
>>>     break;
>>>
>>>   default: /* bug! */
>>>     ASSERT (0);
>>>   }
>>>
>>>   vec_reset_length(event_data);
>>> }
>>>
>>> *From:* vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> *On Behalf Of *Sudhir
>>> CR via lists.fd.io
>>> *Sent:* Monday, February 20, 2023 4:02 AM
>>> *To:* vpp-dev@lists.fd.io
>>> *Subject:* Re: [vpp-dev] process node suspended indefinitely
>>>
>>> Hi Dave,
>>> Thank you for your response and help.
>>>
>>> Please find the additional details below.
>>>
>>> VPP Version *21.10*
>>>
>>> We are creating a process node *rtb-vpp-epoll-process* to handle
>>> control plane events like interface add/delete and route add/delete.
>>> This process node waits for *10ms* (it is not interested in any
>>> events); once the 10ms has expired, it processes the control plane events
>>> mentioned above.
>>>
>>> The code snippet looks like this:
>>>
>>> ```
>>> static uword
>>> rtb_vpp_epoll_process (vlib_main_t *vm,
>>>                        vlib_node_runtime_t *rt,
>>>                        vlib_frame_t *f)
>>> {
>>>   ...
>>>   ...
>>>   while (1) {
>>>     vlib_process_wait_for_event_or_clock (vm, 10e-3);
>>>     vlib_process_get_events (vm, NULL);
>>>
>>>     rtb_event_loop_run_once();   /* <---- control plane events handling */
>>>   }
>>> }
>>> ```
>>>
>>> What we observed is that sometimes (when there is a high control plane
>>> load, such as a request to install more routes) "rtb-vpp-epoll-process" is
>>> suspended and never scheduled again. We found this by using "show runtime
>>> rtb-vpp-epoll-process" (in the "show runtime rtb-vpp-epoll-process"
>>> command output, the suspends counter is not incrementing).
>>>
>>> *show runtime output in working case:*
>>>
>>> ```
>>> DBGvpp# show runtime rtb-vpp-epoll-process
>>>          Name             State     Calls  Vectors  Suspends  Clocks  Vectors/Call
>>> rtb-vpp-epoll-process   any wait        0        0    192246  1.91e6          0.00
>>> DBGvpp#
>>>
>>> DBGvpp# show runtime rtb-vpp-epoll-process
>>>          Name             State     Calls  Vectors  Suspends  Clocks  Vectors/Call
>>> rtb-vpp-epoll-process   any wait        0        0    193634  1.89e6          0.00
>>> DBGvpp#
>>> ```
>>>
>>> *show runtime output in issue case:*
>>>
>>> ```
>>> DBGvpp# show runtime rtb-vpp-epoll-process
>>>          Name             State     Calls  Vectors  Suspends  Clocks  Vectors/Call
>>> rtb-vpp-epoll-process   any wait        0        0     81477  7.08e6          0.00
>>>
>>> DBGvpp# show runtime rtb-vpp-epoll-process
>>>          Name             State     Calls  Vectors  Suspends  Clocks  Vectors/Call
>>> rtb-vpp-epoll-process   any wait        0        0     81477  7.08e6          0.00
>>> ```
>>>
>>> Other process nodes like lldp-process, ip4-neighbor-age-process, and
>>> ip6-ra-process are running without any issue. Only the
>>> "rtb-vpp-epoll-process" process node is suspended forever.
>>>
>>> Please let me know if any additional information is required.
>>>
>>> Hi Jinsh,
>>> Thanks for pointing me to the issue you faced. The issue I am facing
>>> looks similar.
>>> I will verify with the given patch.
>>>
>>> Thanks and Regards,
>>>
>>> Sudhir
>>>
>>> On Sun, Feb 19, 2023 at 6:19 AM jinsh11 <jins...@chinatelecom.cn> wrote:
>>>
>>> Hi:
>>>
>>> - I have the same problem,
>>>
>>> the bfd process node stops running. I raised this issue:
>>>
>>> https://lists.fd.io/g/vpp-dev/message/22380
>>> I think there is a problem with the process scheduling module when using
>>> the time wheel.
>>>
>>> NOTICE TO RECIPIENT This e-mail message and any attachments are
>>> confidential and may be privileged. If you received this e-mail in error,
>>> any review, use, dissemination, distribution, or copying of this e-mail is
>>> strictly prohibited.
>>> Please notify us immediately of the error by return
>>> e-mail and please delete this message from your system. For more
>>> information about Rtbrick, please visit us at www.rtbrick.com
View/Reply Online (#22659): https://lists.fd.io/g/vpp-dev/message/22659