Thanks for pointing out elog feature.

May I understand why my suggestion is not good enough? (for my
understanding only).
"show runtime max" is very useful if we can fix this.

On Fri, Mar 27, 2020 at 2:32 PM Dave Barach (dbarach) <dbar...@cisco.com>
wrote:

> Running for 2.5 seconds in a single graph node is certainly a problem.
>
>
>
> See if you can use the vppinfra “file” infrastructure for non-blocking
> file I/O. Search for clib_file_add(...). You can use a
> VLIB_NODE_TYPE_PROCESS node, suspend the process node after a realistic
> amount of runtime [measured in microseconds!], and cause the process to
> resume by sending it an event from a clib file “read_ready” callback.
>
>
>
> To see how long a given graph node invocation runs, you can use the vpp
> event logger. It will deal with nodes which run for a year, but only if
> you’re willing to wait that long. 😉...
>
>
>
> “elog trace dispatch”
>
> “event-logger save <filename>”
>
>
>
> Refer to this page to see how to view the resulting event log file:
> https://fd.io/docs/vpp/master/gettingstarted/developers/eventviewer.html
>
>
>
> HTH... Dave
>
>
>
> *From:* Nagaraju Vemuri <nagarajuiit...@gmail.com>
> *Sent:* Friday, March 27, 2020 4:41 PM
> *To:* Dave Barach (dbarach) <dbar...@cisco.com>
> *Subject:* Re: [vpp-dev] clocks stored in vlib_node_runtime_t #vpp
> #counters
>
>
>
> We are registering a node, where we do little heavy file operations and it
> is taking 2.5sec or so.
>
> We are going to fix it.
>
>
>
> This infra is supposed to catch any such wrongly coded node.
>
>
>
> If max_clock logic is wrong, we will never catch nodes that (are
> incorrectly coded) take more than 2 seconds.
>
> As per current code, we are always storing max_clock from 32 bit into 64
> bit, so effectively it is always within 32 bit value.
>
>
>
> My suggestion would solve the infra issue and catch such culprits?
>
>
>
> Thanks,
>
> Nagaraju
>
>
>
> On Fri, Mar 27, 2020 at 1:32 PM Dave Barach (dbarach) <dbar...@cisco.com>
> wrote:
>
> The assumption in step 1 won’t hold true in real life. I’m not claiming
> that it’s “impossible,” just that it will not happen with anything like
> properly-coded node dispatch functions.
>
>
>
> A full dispatch circuit across multiple graph nodes - with full frames -
> doesn’t exceed 1ms or so. None of the individual nodes run for anywhere
> near that long.
>
>
>
> HTH... Dave
>
>
>
> *From:* Nagaraju Vemuri <nagarajuiit...@gmail.com>
> *Sent:* Friday, March 27, 2020 3:09 PM
> *To:* Dave Barach (dbarach) <dbar...@cisco.com>
> *Cc:* vpp-dev@lists.fd.io
> *Subject:* Re: [vpp-dev] clocks stored in vlib_node_runtime_t #vpp
> #counters
>
>
>
> Hi Dave,
>
>
>
> I saw that code. Please excuse me if I'm wrong with below issue:
>
> Step #1: vlib_node_runtime_update_stats() receives n_clocks (uword),
> assume this n_clocks has value bigger than 32 bit can hold.
>
> Step #2: We are setting node->max_clock (note that here max_clock already
> overflowed as n_clocks is big value)
>
> Step #3: Now we call vlib_node_runtime_sync_stats() and pass node as
> argument along with n_clocks.
>
> Step #4: In vlib_node_runtime_sync_stats(), we
> set n->stats_total.max_clock = r->max_clock; from node(32 bit field) which
> already overflowed.
>
>
>
> I think we should have the logic like below
> inside vlib_node_runtime_sync_stats().
>
> n->stats_total.max_clock = n_clocks > n->stats_total.max_clock ? n_clocks
> : n->stats_total.max_clock;   # all variables are 64 bit here
>
>
>
> Thanks,
>
> Nagaraju
>
>
>
> On Fri, Mar 27, 2020 at 11:11 AM Dave Barach (dbarach) <dbar...@cisco.com>
> wrote:
>
> vlib_node_runtime_update_stats(...)
>
>
>
> *From:* Nagaraju Vemuri <nagarajuiit...@gmail.com>
> *Sent:* Friday, March 27, 2020 12:54 PM
> *To:* Dave Barach (dbarach) <dbar...@cisco.com>
> *Cc:* vpp-dev@lists.fd.io
> *Subject:* Re: [vpp-dev] clocks stored in vlib_node_runtime_t #vpp
> #counters
>
>
>
> Thanks for your reply Dave.
>
>
>
> Can you please point me to the code where we copy into 64 bit counters
> when 32 bit overflows?
>
>
>
> When we issue the command "show runtime max", do we print the data from 64
> bit counters?
>
>
>
> I made max_clock variable 64 bit type locally to experiment.
>
> And found different output from "show runtime max".
>
> Max clock was 7.49e9 with 64 bit field, whereas it was within 4e9(because
> of 32 bit limitation) with 32 bit field.
>
>
>
> Regards,
>
> Nagaraju
>
>
>
> On Fri, Mar 27, 2020 at 4:52 AM Dave Barach (dbarach) <dbar...@cisco.com>
> wrote:
>
> We scrape the 32-bit node-runtime counters back into the 64 bit node
> counters when they overflow.
>
>
>
> It should take a good long time for a 32-bit counter to overflow – O(1
> seconds’ worth of node runtime) – we do this to reduce the cache footprint
> of the node-runtime.
>
>
>
> HTH... Dave
>
>
>
> *From:* vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> *On Behalf Of *
> nagarajuiit...@gmail.com
> *Sent:* Friday, March 27, 2020 1:59 AM
> *To:* vpp-dev@lists.fd.io
> *Subject:* [vpp-dev] clocks stored in vlib_node_runtime_t #vpp #counters
>
>
>
> Hi,
>
> I see that clib_cpu_time_now() returns u64 type data.
> And we are updating max clocks spent by a node in vlib_node_runtime_t
> using these timestamps.
>
> Why are we using u32 types in vlib_node_runtime_t?
> Because of this, it can only hold data within 2 seconds and anything above
> 32 bit range overflows in our system.
>
> Really appreciate your help.
>
> Thanks,
> Nagaraju
>
>
>
>
> --
>
> Thanks,
> Nagaraju Vemuri
>
>
>
>
> --
>
> Thanks,
> Nagaraju Vemuri
>
>
>
>
> --
>
> Thanks,
> Nagaraju Vemuri
>


-- 
Thanks,
Nagaraju Vemuri
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15909): https://lists.fd.io/g/vpp-dev/message/15909
Mute This Topic: https://lists.fd.io/mt/72581795/21656
Mute #vpp: https://lists.fd.io/mk?hashtag=vpp&subid=1480452
Mute #counters: https://lists.fd.io/mk?hashtag=counters&subid=1480452
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to