Martin Kennelly <[email protected]> writes:
> Hey ovs community,
>
> I am a developer working on ovn-kubernetes and I want to programmatically
> consume long poll information
> i.e:
> ovs|00211|timeval(handler25)|WARN|Unreasonably long 52388ms poll interval
> (752ms user, 209ms system)
>
> This is currently exposed via journal logs but it's not practical to consume
> it there programmatically and I was
> hoping you could add it to coverage metrics.

I think it could be useful. I do want to be careful about exposing
these kinds of data in a way that could be misinterpreted. Already,
that log in particular gets misinterpreted quite a bit, and RH gets
customers claiming OVS is misbehaving when they've oversubscribed the
system.

Mechanically, it would be pretty simple to do something like:

---
diff --git a/lib/timeval.c b/lib/timeval.c
index 193c7bab17..00e5f2a74d 100644
--- a/lib/timeval.c
+++ b/lib/timeval.c
@@ -40,6 +40,7 @@
 #include "openvswitch/vlog.h"
 
 VLOG_DEFINE_THIS_MODULE(timeval);
+COVERAGE_DEFINE(long_poll_interval);
 
 #if !defined(HAVE_CLOCK_GETTIME)
 typedef unsigned int clockid_t;
@@ -645,6 +646,8 @@ log_poll_interval(long long int last_wakeup)
         struct rusage rusage;
 
         if (!getrusage_thread(&rusage)) {
+            COVERAGE_INC(long_poll_interval);
+
             VLOG_WARN("Unreasonably long %lldms poll interval"
                       " (%lldms user, %lldms system)",
                       interval,
---

This would at least expose the event through the coverage framework,
where it can be queried with 'ovs-appctl coverage/show'. The real
advantage is that a coverage counter tracks the event rate (X/sec)
averaged over the last 5 seconds, minute, and hour, in addition to the
running total, so we can see whether the condition is still ongoing.
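
For the consuming side, here is a rough sketch of how a client could pick
the counter out of 'ovs-appctl coverage/show' output. It assumes the
patch above is applied (the counter name long_poll_interval does not
exist otherwise) and that each counter row looks like
"name  X/sec  Y/sec  Z/sec  total: N"; please check that layout against
the OVS version in use. The struct and function names are hypothetical,
not part of OVS.

```c
#include <stdio.h>
#include <string.h>

/* One parsed counter row.  Hypothetical helper type, not an OVS API. */
struct coverage_row {
    char name[64];
    double per_sec_5s;          /* Average rate over the last 5 seconds. */
    double per_sec_min;         /* Average rate over the last minute. */
    double per_sec_hour;        /* Average rate over the last hour. */
    unsigned long long total;   /* Total count reported on the row. */
};

/* Parses one line of the assumed "name X/sec Y/sec Z/sec total: N"
 * layout.  Returns 1 if the line matched, 0 otherwise (for example, for
 * the header line). */
int
parse_coverage_row(const char *line, struct coverage_row *row)
{
    int n = sscanf(line, "%63s %lf/sec %lf/sec %lf/sec total: %llu",
                   row->name, &row->per_sec_5s, &row->per_sec_min,
                   &row->per_sec_hour, &row->total);
    return n == 5;
}

/* Scans 'in' line by line for the counter called 'name'.  Returns 1 and
 * fills in 'row' if it is found, 0 otherwise. */
int
find_counter(FILE *in, const char *name, struct coverage_row *row)
{
    char line[256];

    while (fgets(line, sizeof line, in)) {
        if (parse_coverage_row(line, row) && !strcmp(row->name, name)) {
            return 1;
        }
    }
    return 0;
}
```

A caller could feed find_counter() a stream opened with POSIX popen() on
"ovs-appctl coverage/show" and treat a nonzero per_sec_5s as "the
condition is happening right now", with total giving the overall count.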
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev