On Fri, Aug 16, 2013 at 10:48:44AM -0700, Alex Wang wrote:
> This commit changes the "ovs-appclt coverage/show" command to show the
> secondly, minutely and hourly rates of function invocation. More
> importantly, this makes using coverage counter an easy way to monitor
> the execution of specific functions.
>
> Signed-off-by: Alex Wang <[email protected]>
Thanks for the patch.
Superficially, I noticed some oddities in wording. The program for
talking to a running OVS daemon is "ovs-appctl" (not ovs-appclt). But
it's really the words "secondly" and "minutely" that jumped out at me.
"Secondly" doesn't mean "per second". "Minutely" can mean "per
minute" but it usually means "precisely".
I also noticed a cast to float. There's no benefit to float over
double these days, so I changed this to use double instead.
All this is trivia. I fixed it up locally and I'm appending a revised
version.
But there's a bigger problem that I think needs to get fixed first.
The coverage module, as is, is not thread safe. I think that it will
not cause any grave problem such as a segfault or wild pointer
dereference, but every thread's use of coverage interferes with every
other thread's use. I do not want to potentially make this worse, by
adding additional sophistication and complexity to the coverage
module, before fixing it. So I've sent out a patch that make coverage
thread safe for review:
http://openvswitch.org/pipermail/dev/2013-August/030833.html
and after it or some equivalent is in, it will be time to revise and
review this patch.
Thanks,
Ben.
--8<--------------------------cut here-------------------------->8--
From: Alex Wang <[email protected]>
Date: Fri, 16 Aug 2013 15:25:15 -0700
Subject: [PATCH] coverage: Reimplement the "ovs-appctl coverage/show" command
This commit changes the "ovs-appctl coverage/show" command to show the
per-second, per-minute and per-hour rates of function invocation. More
importantly, this makes using coverage counter an easy way to monitor
the execution of specific functions.
Signed-off-by: Alex Wang <[email protected]>
Signed-off-by: Ben Pfaff <[email protected]>
Acked-by: Andy Zhou <[email protected]>
---
lib/coverage-unixctl.man | 3 +-
lib/coverage.c | 96 +++++++++++++++++++++++++++++++++++++++-------
lib/coverage.h | 22 +++++++++--
lib/timeval.c | 1 -
tests/ofproto-dpif.at | 32 ++++++++++++++++
vswitchd/ovs-vswitchd.c | 2 +
6 files changed, 136 insertions(+), 20 deletions(-)
diff --git a/lib/coverage-unixctl.man b/lib/coverage-unixctl.man
index 9718894..49e8daa 100644
--- a/lib/coverage-unixctl.man
+++ b/lib/coverage-unixctl.man
@@ -8,4 +8,5 @@ main loop takes unusually long to run.
Coverage counters are useful mainly for performance analysis and
debugging.
.IP "\fBcoverage/show\fR"
-Displays the values of all of the coverage counters.
+Displays the per-second, per-minute and per-hour rates and total count
+of all of the coverage counters.
diff --git a/lib/coverage.c b/lib/coverage.c
index f152474..3e1226a 100644
--- a/lib/coverage.c
+++ b/lib/coverage.c
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2009, 2010, 2011, 2012 Nicira, Inc.
+ * Copyright (c) 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -47,9 +47,19 @@ struct coverage_counter *coverage_counters[] = {
#define n_coverage_counters ARRAY_SIZE(coverage_counters)
#endif /* !USE_LINKER_SECTIONS */
-static unsigned int epoch;
+static long long int coverage_run_time = LLONG_MIN;
+
+/* Defines the moving average array index variables. */
+static unsigned int min_idx = 0;
+static unsigned int hr_idx = 0;
+
+/* Index counter used to compute the moving average array's index. */
+static unsigned int idx_count = 0;
static void coverage_read(struct svec *);
+static unsigned int coverage_array_sum(const unsigned int *arr,
+ const unsigned int len);
+
static void
coverage_unixctl_show(struct unixctl_conn *conn, int argc OVS_UNUSED,
@@ -170,7 +180,7 @@ coverage_log(void)
uint32_t hash = coverage_hash();
if (coverage_hit(hash)) {
VLOG_INFO("Skipping details of duplicate event coverage for "
- "hash=%08"PRIx32" in epoch %u", hash, epoch);
+ "hash=%08"PRIx32, hash);
} else {
struct svec lines;
const char *line;
@@ -189,8 +199,17 @@ coverage_log(void)
static void
coverage_read_counter(struct svec *lines, const struct coverage_counter *c)
{
- svec_add_nocopy(lines, xasprintf("%-24s %5u / %9llu",
- c->name, c->count, c->count + c->total));
+ /* The per-second rate is calculated via averaging the counts in
+ * the most recent COVERAGE_RUN_INTERVAL interval. */
+ svec_add_nocopy(lines,
+ xasprintf("%-24s %5.1f/sec %7u/min "
+ "%9u/hr total: %llu",
+ c->name,
+ (c->min[(idx_count - 1) % MIN_AVG_LEN]
+ * 1000.0 / COVERAGE_RUN_INTERVAL),
+ coverage_array_sum(c->min, MIN_AVG_LEN),
+ coverage_array_sum(c->hr, HR_AVG_LEN),
+ c->count + c->total));
}
/* Adds coverage counter information to 'lines'. */
@@ -204,8 +223,8 @@ coverage_read(struct svec *lines)
hash = coverage_hash();
n_never_hit = 0;
- svec_add_nocopy(lines, xasprintf("Event coverage (epoch %u/entire run), "
- "hash=%08"PRIx32":", epoch, hash));
+ svec_add_nocopy(lines, xasprintf("Event coverage (frequency), "
+ "hash=%08"PRIx32":", hash));
for (i = 0; i < n_coverage_counters; i++) {
struct coverage_counter *c = coverage_counters[i];
if (c->count) {
@@ -225,16 +244,65 @@ coverage_read(struct svec *lines)
svec_add_nocopy(lines, xasprintf("%zu events never hit", n_never_hit));
}
-/* Advances to the next epoch of coverage, resetting all the counters to 0. */
+/* Runs approximately every COVERAGE_RUN_INTERVAL seconds to update the
+ * coverage counters' 'min' and 'hr' array. 'min' array is for cumulating
+ * per-second counts into per-minute count. 'hr' array is for cumulating
+ * per-minute counts into hourly count. */
void
-coverage_clear(void)
+coverage_run(void)
{
+ long long int now = time_msec();
+
+ /* Initialize the coverage_run_time. */
+ if (coverage_run_time == LLONG_MIN) {
+ coverage_run_time = now;
+ }
+
+ if (now >= coverage_run_time) {
+ size_t i, j;
+ /* Computes the number of COVERAGE_RUN_INTERVAL slots, since
+ * it is possible that the actual run interval is multiple of
+ * COVERAGE_RUN_INTERVAL. */
+ int slots = (now - coverage_run_time) / COVERAGE_RUN_INTERVAL + 1;
+
+ for (i = 0; i < slots; i++) {
+ /* Updates the index variables. */
+ /* The min_idx is increased from 0 to MIN_AVG_LEN - 1. Every
+ * time the min_idx finishes a cycle (a cycle is one minute),
+ * the hr_idx is incremented by 1. */
+ min_idx = idx_count % MIN_AVG_LEN;
+ hr_idx = idx_count / MIN_AVG_LEN;
+
+ for (j = 0; j < n_coverage_counters; j++) {
+ struct coverage_counter *c = coverage_counters[j];
+ int portion;
+
+ /* The count over the time interval is evenly distributed
+ * among slots by calculating the portion. */
+ portion = c->count / slots + (i == (slots - 1) ?
+ c->count % slots : 0);
+ c->total += portion;
+ c->count = (i == (slots - 1) ? 0 : c->count);
+ c->min[min_idx] = portion;
+ c->hr[hr_idx] = min_idx == 0 ? portion
+ : (c->hr[hr_idx] + portion);
+ }
+ /* This is to guarantee that hr_idx ranges from 0 to 59. */
+ idx_count = (idx_count + 1) % (MIN_AVG_LEN * HR_AVG_LEN);
+ }
+ /* Updates the run time. */
+ coverage_run_time = now + COVERAGE_RUN_INTERVAL;
+ }
+}
+
+static unsigned int
+coverage_array_sum(const unsigned int *arr, const unsigned int len)
+{
+ unsigned int sum = 0;
size_t i;
- epoch++;
- for (i = 0; i < n_coverage_counters; i++) {
- struct coverage_counter *c = coverage_counters[i];
- c->total += c->count;
- c->count = 0;
+ for (i = 0; i < len; i++) {
+ sum += arr[i];
}
+ return sum;
}
diff --git a/lib/coverage.h b/lib/coverage.h
index 968c489..7af2cb1 100644
--- a/lib/coverage.h
+++ b/lib/coverage.h
@@ -29,11 +29,24 @@
#include "vlog.h"
+/* Makes coverage_run run every 5000 ms (5 seconds).
+ * If this value is redefined, the new value must
+ * divide 60000. */
+#define COVERAGE_RUN_INTERVAL 5000
+
+/* Defines the moving average array length. */
+#define MIN_AVG_LEN (60000/COVERAGE_RUN_INTERVAL)
+#define HR_AVG_LEN 60
+
/* A coverage counter. */
struct coverage_counter {
const char *name; /* Textual name. */
- unsigned int count; /* Count within the current epoch. */
- unsigned long long int total; /* Total count over all epochs. */
+ unsigned int count; /* Count within the current second. */
+ unsigned long long int total; /* Total count over all time. */
+
+ /* The moving average arrays. */
+ unsigned int min[MIN_AVG_LEN];
+ unsigned int hr[HR_AVG_LEN];
};
/* Defines COUNTER. There must be exactly one such definition at file scope
@@ -57,11 +70,12 @@ struct coverage_counter {
void coverage_init(void);
void coverage_log(void);
-void coverage_clear(void);
+void coverage_run(void);
/* Implementation detail. */
#define COVERAGE_DEFINE__(COUNTER) \
extern struct coverage_counter counter_##COUNTER; \
- struct coverage_counter counter_##COUNTER = { #COUNTER, 0, 0 }
+ struct coverage_counter counter_##COUNTER = { #COUNTER, 0, 0, \
+ {0}, {0} }
#endif /* coverage.h */
diff --git a/lib/timeval.c b/lib/timeval.c
index faf8e7b..2380a0a 100644
--- a/lib/timeval.c
+++ b/lib/timeval.c
@@ -304,7 +304,6 @@ time_poll(struct pollfd *pollfds, int n_pollfds, long long
int timeout_when,
if (*last_wakeup) {
log_poll_interval(*last_wakeup);
}
- coverage_clear();
start = time_msec();
blocked = false;
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index b093998..fa058f7 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -2689,3 +2689,35 @@ AT_CHECK([tail -1 stdout], [0], [Datapath actions: 5
])
OVS_VSWITCHD_STOP
AT_CLEANUP
+
+# Unit test for appctl coverage/show command
+AT_SETUP([ofproto-dpif - coverage/show])
+OVS_VSWITCHD_START
+
+ovs-appctl time/stop
+# before the first 5 seconds, nothing can be calculated but the total count.
+AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'], [], [dnl
+unixctl_received 0.0/sec 0/min 0/hr total: 2
+])
+
+ovs-appctl time/warp 5000
+# at first 5 second time instant, should have stats.
+AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'], [], [dnl
+unixctl_received 0.6/sec 3/min 3/hr total: 4
+])
+
+for i in `seq 0 10`; do ovs-appctl time/warp 5000; done
+# advance to first 1 minute time instant.
+AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'], [], [dnl
+unixctl_received 0.2/sec 15/min 15/hr total: 16
+])
+
+ovs-appctl time/warp 60000
+# advance to next 1 minute time instant directly, should observe the per-minute
+# rate drop.
+AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'], [], [dnl
+unixctl_received 0.4/sec 2/min 17/hr total: 18
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c
index bc45dac..a9db1fd 100644
--- a/vswitchd/ovs-vswitchd.c
+++ b/vswitchd/ovs-vswitchd.c
@@ -28,6 +28,7 @@
#include "bridge.h"
#include "command-line.h"
#include "compiler.h"
+#include "coverage.h"
#include "daemon.h"
#include "dirs.h"
#include "dpif.h"
@@ -119,6 +120,7 @@ main(int argc, char *argv[])
bridge_run_fast();
unixctl_server_run(unixctl);
netdev_run();
+ coverage_run();
signal_wait(sighup);
memory_wait();
--
1.7.10.4
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev