Martin Kennelly observes that even though this data is available to humans via the journal/log files, those files aren't exactly easy for a developer to draw behavioral inferences from. This kind of log and counter would be useful when checking on system health, to let us know that an Open vSwitch component is noticing some kind of system-level hiccup.

Add a new coverage counter to track these events, and let a developer or system engineer know how often they have occurred, with some historical context.

Reported-at: https://lists.linuxfoundation.org/pipermail/ovs-discuss/2023-June/052523.html
Reported-by: Martin Kennelly <[email protected]>
Signed-off-by: Aaron Conole <[email protected]>
---
 lib/timeval.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/timeval.c b/lib/timeval.c
index 193c7bab17..00e5f2a74d 100644
--- a/lib/timeval.c
+++ b/lib/timeval.c
@@ -40,6 +40,7 @@
 #include "openvswitch/vlog.h"
 
 VLOG_DEFINE_THIS_MODULE(timeval);
+COVERAGE_DEFINE(long_poll_interval);
 
 #if !defined(HAVE_CLOCK_GETTIME)
 typedef unsigned int clockid_t;
@@ -645,6 +646,8 @@ log_poll_interval(long long int last_wakeup)
         struct rusage rusage;
 
         if (!getrusage_thread(&rusage)) {
+            COVERAGE_INC(long_poll_interval);
+
             VLOG_WARN("Unreasonably long %lldms poll interval"
                       " (%lldms user, %lldms system)",
                       interval,
-- 
2.40.1

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
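
For readers unfamiliar with the idea of a counter that carries "historical context", here is a minimal standalone sketch. It is not the Open vSwitch coverage implementation; the names event_counter, counter_inc, counter_tick, counter_recent, note_poll_interval, HIST_SLOTS and LONG_POLL_MS are all hypothetical, and the 1000 ms threshold is an assumption made for illustration only. The sketch keeps a lifetime total plus a small ring of per-interval counts, so a reader can tell both how many long poll intervals have ever been seen and how many were seen recently.

```c
#include <stddef.h>

/* Hypothetical sizes: 12 slots; the caller decides how long one slot lasts. */
#define HIST_SLOTS 12
/* Assumed threshold for a "long" poll interval, for illustration only. */
#define LONG_POLL_MS 1000

struct event_counter {
    unsigned long long total;        /* Events seen since start. */
    unsigned int slots[HIST_SLOTS];  /* Per-interval counts (ring buffer). */
    size_t cur;                      /* Index of the slot being filled. */
};

/* Records one event in the lifetime total and in the current slot. */
static void
counter_inc(struct event_counter *c)
{
    c->total++;
    c->slots[c->cur]++;
}

/* Advances to the next slot, discarding the oldest history entry.
 * Meant to be called once per sampling interval by the caller. */
static void
counter_tick(struct event_counter *c)
{
    c->cur = (c->cur + 1) % HIST_SLOTS;
    c->slots[c->cur] = 0;
}

/* Returns the number of events recorded across the retained slots. */
static unsigned int
counter_recent(const struct event_counter *c)
{
    unsigned int sum = 0;
    size_t i;

    for (i = 0; i < HIST_SLOTS; i++) {
        sum += c->slots[i];
    }
    return sum;
}

/* Counts the interval only if it exceeds the assumed threshold. */
static void
note_poll_interval(struct event_counter *c, long long interval_ms)
{
    if (interval_ms >= LONG_POLL_MS) {
        counter_inc(c);
    }
}
```

A caller would invoke note_poll_interval() after each wakeup and counter_tick() on a fixed timer. Comparing counter_recent() against total then distinguishes a one-off hiccup from a recurring one, which is the kind of inference that is hard to make from log lines alone.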
