Running for 2.5 seconds in a single graph node is certainly a problem. See if you can use the vppinfra “file” infrastructure for non-blocking file I/O. Search for clib_file_add(...). You can use a VLIB_NODE_TYPE_PROCESS node, suspend the process node after a realistic amount of runtime [measured in microseconds!], and cause the process to resume by sending it an event from a clib file “read_ready” callback.
To see how long a given graph node invocation runs, you can use the vpp event logger. It will deal with nodes which run for a year, but only if you’re willing to wait that long. 😉... “elog trace dispatch” “event-logger save <filename>” Refer to this page to see how to view the resulting event log file: https://fd.io/docs/vpp/master/gettingstarted/developers/eventviewer.html HTH... Dave From: Nagaraju Vemuri <nagarajuiit...@gmail.com> Sent: Friday, March 27, 2020 4:41 PM To: Dave Barach (dbarach) <dbar...@cisco.com> Subject: Re: [vpp-dev] clocks stored in vlib_node_runtime_t #vpp #counters We are registering a node, where we do little heavy file operations and it is taking 2.5sec or so. We are going to fix it. This infra is supposed to catch any such wrongly coded node. If max_clock logic is wrong, we will never catch nodes that (are incorrectly coded) take more than 2 seconds. As per current code, we are always storing max_clock from 32 bit into 64 bit, so effectively it is always within 32 bit value. My suggestion would solve the infra issue and catch such culprits? Thanks, Nagaraju On Fri, Mar 27, 2020 at 1:32 PM Dave Barach (dbarach) <dbar...@cisco.com<mailto:dbar...@cisco.com>> wrote: The assumption in step 1 won’t hold true in real life. I’m not claiming that it’s “impossible,” just that it will not happen with anything like properly-coded node dispatch functions. A full dispatch circuit across multiple graph nodes - with full frames - doesn’t exceed 1ms or so. None of the individual nodes run for anywhere near that long. HTH... Dave From: Nagaraju Vemuri <nagarajuiit...@gmail.com<mailto:nagarajuiit...@gmail.com>> Sent: Friday, March 27, 2020 3:09 PM To: Dave Barach (dbarach) <dbar...@cisco.com<mailto:dbar...@cisco.com>> Cc: vpp-dev@lists.fd.io<mailto:vpp-dev@lists.fd.io> Subject: Re: [vpp-dev] clocks stored in vlib_node_runtime_t #vpp #counters Hi Dave, I saw that code. Please excuse me if I'm wrong with below issue: Step #1: vlib_node_runtime_update_stats() receives n_clocks (uword), assume this n_clocks has value bigger than 32 bit can hold. Step #2: We are setting node->max_clock (note that here max_clock already overflowed as n_clocks is big value) Step #3: Now we call vlib_node_runtime_sync_stats() and pass node as argument along with n_clocks. Step #4: In vlib_node_runtime_sync_stats(), we set n->stats_total.max_clock = r->max_clock; from node(32 bit field) which already overflowed. I think we should have the logic like below inside vlib_node_runtime_sync_stats(). n->stats_total.max_clock = n_clocks > n->stats_total.max_clock ? n_clocks : n->stats_total.max_clock; # all variables are 64 bit here Thanks, Nagaraju On Fri, Mar 27, 2020 at 11:11 AM Dave Barach (dbarach) <dbar...@cisco.com<mailto:dbar...@cisco.com>> wrote: vlib_node_runtime_update_stats(...) From: Nagaraju Vemuri <nagarajuiit...@gmail.com<mailto:nagarajuiit...@gmail.com>> Sent: Friday, March 27, 2020 12:54 PM To: Dave Barach (dbarach) <dbar...@cisco.com<mailto:dbar...@cisco.com>> Cc: vpp-dev@lists.fd.io<mailto:vpp-dev@lists.fd.io> Subject: Re: [vpp-dev] clocks stored in vlib_node_runtime_t #vpp #counters Thanks for your reply Dave. Can you please point me to the code where we copy into 64 bit counters when 32 bit overflows? When we issue the command "show runtime max", do we print the data from 64 bit counters? I made max_clock variable 64 bit type locally to experiment. And found different output from "show runtime max". Max clock was 7.49e9 with 64 bit field, whereas it was within 4e9(because of 32 bit limitation) with 32 bit field. Regards, Nagaraju On Fri, Mar 27, 2020 at 4:52 AM Dave Barach (dbarach) <dbar...@cisco.com<mailto:dbar...@cisco.com>> wrote: We scrape the 32-bit node-runtime counters back into the 64 bit node counters when they overflow. It should take a good long time for a 32-bit counter to overflow – O(1 seconds’ worth of node runtime) – we do this to reduce the cache footprint of the node-runtime. HTH... Dave From: vpp-dev@lists.fd.io<mailto:vpp-dev@lists.fd.io> <vpp-dev@lists.fd.io<mailto:vpp-dev@lists.fd.io>> On Behalf Of nagarajuiit...@gmail.com<mailto:nagarajuiit...@gmail.com> Sent: Friday, March 27, 2020 1:59 AM To: vpp-dev@lists.fd.io<mailto:vpp-dev@lists.fd.io> Subject: [vpp-dev] clocks stored in vlib_node_runtime_t #vpp #counters Hi, I see that clib_cpu_time_now() returns u64 type data. And we are updating max clocks spent by a node in vlib_node_runtime_t using these timestamps. Why are we using u32 types in vlib_node_runtime_t? Because of this, it can only hold data within 2 seconds and anything above 32 bit range overflows in our system. Really appreciate your help. Thanks, Nagaraju -- Thanks, Nagaraju Vemuri -- Thanks, Nagaraju Vemuri -- Thanks, Nagaraju Vemuri
-=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#15908): https://lists.fd.io/g/vpp-dev/message/15908 Mute This Topic: https://lists.fd.io/mt/72581795/21656 Mute #vpp: https://lists.fd.io/mk?hashtag=vpp&subid=1480452 Mute #counters: https://lists.fd.io/mk?hashtag=counters&subid=1480452 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-