On Fri, Aug 16, 2013 at 10:48:44AM -0700, Alex Wang wrote: > This commit changes the "ovs-appclt coverage/show" command to show the > secondly, minutely and hourly rates of function invocation. More > importantly, this makes using coverage counter an easy way to monitor > the execution of specific functions. > > Signed-off-by: Alex Wang <al...@nicira.com>
Thanks for the patch. Superficially, I noticed some oddities in wording. The program for talking to a running OVS daemon is "ovs-appctl" (not ovs-appclt). But it's really the words "secondly" and "minutely" that jumped out at me. "Secondly" doesn't mean "per second". "Minutely" can mean "per minute" but it usually means "precisely". I also noticed a cast to float. There's no benefit to float over double these days, so I changed this to use double instead. All this is trivia. I fixed it up locally and I'm appending a revised version. But there's a bigger problem that I think needs to get fixed first. The coverage module, as is, is not thread safe. I think that it will not cause any grave problem such as a segfault or wild pointer dereference, but every thread's use of coverage interferes with every other thread's use. I do not want to potentially make this worse, by adding additional sophistication and complexity to the coverage module, before fixing it. So I've sent out a patch that make coverage thread safe for review: http://openvswitch.org/pipermail/dev/2013-August/030833.html and after it or some equivalent is in, it will be time to revise and review this patch. Thanks, Ben. --8<--------------------------cut here-------------------------->8-- From: Alex Wang <al...@nicira.com> Date: Fri, 16 Aug 2013 15:25:15 -0700 Subject: [PATCH] coverage: Reimplement the "ovs-appctl coverage/show" command This commit changes the "ovs-appctl coverage/show" command to show the per-second, per-minute and per-hour rates of function invocation. More importantly, this makes using coverage counter an easy way to monitor the execution of specific functions. Signed-off-by: Alex Wang <al...@nicira.com> Signed-off-by: Ben Pfaff <b...@nicira.com> Acked-by: Andy Zhou <az...@nicira.com> --- lib/coverage-unixctl.man | 3 +- lib/coverage.c | 96 +++++++++++++++++++++++++++++++++++++++------- lib/coverage.h | 22 +++++++++-- lib/timeval.c | 1 - tests/ofproto-dpif.at | 32 ++++++++++++++++ vswitchd/ovs-vswitchd.c | 2 + 6 files changed, 136 insertions(+), 20 deletions(-) diff --git a/lib/coverage-unixctl.man b/lib/coverage-unixctl.man index 9718894..49e8daa 100644 --- a/lib/coverage-unixctl.man +++ b/lib/coverage-unixctl.man @@ -8,4 +8,5 @@ main loop takes unusually long to run. Coverage counters are useful mainly for performance analysis and debugging. .IP "\fBcoverage/show\fR" -Displays the values of all of the coverage counters. +Displays the per-second, per-minute and per-hour rates and total count +of all of the coverage counters. diff --git a/lib/coverage.c b/lib/coverage.c index f152474..3e1226a 100644 --- a/lib/coverage.c +++ b/lib/coverage.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2009, 2010, 2011, 2012 Nicira, Inc. + * Copyright (c) 2009, 2010, 2011, 2012, 2013 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -47,9 +47,19 @@ struct coverage_counter *coverage_counters[] = { #define n_coverage_counters ARRAY_SIZE(coverage_counters) #endif /* !USE_LINKER_SECTIONS */ -static unsigned int epoch; +static long long int coverage_run_time = LLONG_MIN; + +/* Defines the moving average array index variables. */ +static unsigned int min_idx = 0; +static unsigned int hr_idx = 0; + +/* Index counter used to compute the moving average array's index. */ +static unsigned int idx_count = 0; static void coverage_read(struct svec *); +static unsigned int coverage_array_sum(const unsigned int *arr, + const unsigned int len); + static void coverage_unixctl_show(struct unixctl_conn *conn, int argc OVS_UNUSED, @@ -170,7 +180,7 @@ coverage_log(void) uint32_t hash = coverage_hash(); if (coverage_hit(hash)) { VLOG_INFO("Skipping details of duplicate event coverage for " - "hash=%08"PRIx32" in epoch %u", hash, epoch); + "hash=%08"PRIx32, hash); } else { struct svec lines; const char *line; @@ -189,8 +199,17 @@ coverage_log(void) static void coverage_read_counter(struct svec *lines, const struct coverage_counter *c) { - svec_add_nocopy(lines, xasprintf("%-24s %5u / %9llu", - c->name, c->count, c->count + c->total)); + /* The per-second rate is calculated via averaging the counts in + * the most recent COVERAGE_RUN_INTERVAL interval. */ + svec_add_nocopy(lines, + xasprintf("%-24s %5.1f/sec %7u/min " + "%9u/hr total: %llu", + c->name, + (c->min[(idx_count - 1) % MIN_AVG_LEN] + * 1000.0 / COVERAGE_RUN_INTERVAL), + coverage_array_sum(c->min, MIN_AVG_LEN), + coverage_array_sum(c->hr, HR_AVG_LEN), + c->count + c->total)); } /* Adds coverage counter information to 'lines'. */ @@ -204,8 +223,8 @@ coverage_read(struct svec *lines) hash = coverage_hash(); n_never_hit = 0; - svec_add_nocopy(lines, xasprintf("Event coverage (epoch %u/entire run), " - "hash=%08"PRIx32":", epoch, hash)); + svec_add_nocopy(lines, xasprintf("Event coverage (frequency), " + "hash=%08"PRIx32":", hash)); for (i = 0; i < n_coverage_counters; i++) { struct coverage_counter *c = coverage_counters[i]; if (c->count) { @@ -225,16 +244,65 @@ coverage_read(struct svec *lines) svec_add_nocopy(lines, xasprintf("%zu events never hit", n_never_hit)); } -/* Advances to the next epoch of coverage, resetting all the counters to 0. */ +/* Runs approximately every COVERAGE_RUN_INTERVAL seconds to update the + * coverage counters' 'min' and 'hr' array. 'min' array is for cumulating + * per-second counts into per-minute count. 'hr' array is for cumulating + * per-minute counts into hourly count. */ void -coverage_clear(void) +coverage_run(void) { + long long int now = time_msec(); + + /* Initialize the coverage_run_time. */ + if (coverage_run_time == LLONG_MIN) { + coverage_run_time = now; + } + + if (now >= coverage_run_time) { + size_t i, j; + /* Computes the number of COVERAGE_RUN_INTERVAL slots, since + * it is possible that the actual run interval is multiple of + * COVERAGE_RUN_INTERVAL. */ + int slots = (now - coverage_run_time) / COVERAGE_RUN_INTERVAL + 1; + + for (i = 0; i < slots; i++) { + /* Updates the index variables. */ + /* The min_idx is increased from 0 to MIN_AVG_LEN - 1. Every + * time the min_idx finishes a cycle (a cycle is one minute), + * the hr_idx is incremented by 1. */ + min_idx = idx_count % MIN_AVG_LEN; + hr_idx = idx_count / MIN_AVG_LEN; + + for (j = 0; j < n_coverage_counters; j++) { + struct coverage_counter *c = coverage_counters[j]; + int portion; + + /* The count over the time interval is evenly distributed + * among slots by calculating the portion. */ + portion = c->count / slots + (i == (slots - 1) ? + c->count % slots : 0); + c->total += portion; + c->count = (i == (slots - 1) ? 0 : c->count); + c->min[min_idx] = portion; + c->hr[hr_idx] = min_idx == 0 ? portion + : (c->hr[hr_idx] + portion); + } + /* This is to guarantee that hr_idx ranges from 0 to 59. */ + idx_count = (idx_count + 1) % (MIN_AVG_LEN * HR_AVG_LEN); + } + /* Updates the run time. */ + coverage_run_time = now + COVERAGE_RUN_INTERVAL; + } +} + +static unsigned int +coverage_array_sum(const unsigned int *arr, const unsigned int len) +{ + unsigned int sum = 0; size_t i; - epoch++; - for (i = 0; i < n_coverage_counters; i++) { - struct coverage_counter *c = coverage_counters[i]; - c->total += c->count; - c->count = 0; + for (i = 0; i < len; i++) { + sum += arr[i]; } + return sum; } diff --git a/lib/coverage.h b/lib/coverage.h index 968c489..7af2cb1 100644 --- a/lib/coverage.h +++ b/lib/coverage.h @@ -29,11 +29,24 @@ #include "vlog.h" +/* Makes coverage_run run every 5000 ms (5 seconds). + * If this value is redefined, the new value must + * divide 60000. */ +#define COVERAGE_RUN_INTERVAL 5000 + +/* Defines the moving average array length. */ +#define MIN_AVG_LEN (60000/COVERAGE_RUN_INTERVAL) +#define HR_AVG_LEN 60 + /* A coverage counter. */ struct coverage_counter { const char *name; /* Textual name. */ - unsigned int count; /* Count within the current epoch. */ - unsigned long long int total; /* Total count over all epochs. */ + unsigned int count; /* Count within the current second. */ + unsigned long long int total; /* Total count over all time. */ + + /* The moving average arrays. */ + unsigned int min[MIN_AVG_LEN]; + unsigned int hr[HR_AVG_LEN]; }; /* Defines COUNTER. There must be exactly one such definition at file scope @@ -57,11 +70,12 @@ struct coverage_counter { void coverage_init(void); void coverage_log(void); -void coverage_clear(void); +void coverage_run(void); /* Implementation detail. */ #define COVERAGE_DEFINE__(COUNTER) \ extern struct coverage_counter counter_##COUNTER; \ - struct coverage_counter counter_##COUNTER = { #COUNTER, 0, 0 } + struct coverage_counter counter_##COUNTER = { #COUNTER, 0, 0, \ + {0}, {0} } #endif /* coverage.h */ diff --git a/lib/timeval.c b/lib/timeval.c index faf8e7b..2380a0a 100644 --- a/lib/timeval.c +++ b/lib/timeval.c @@ -304,7 +304,6 @@ time_poll(struct pollfd *pollfds, int n_pollfds, long long int timeout_when, if (*last_wakeup) { log_poll_interval(*last_wakeup); } - coverage_clear(); start = time_msec(); blocked = false; diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index b093998..fa058f7 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -2689,3 +2689,35 @@ AT_CHECK([tail -1 stdout], [0], [Datapath actions: 5 ]) OVS_VSWITCHD_STOP AT_CLEANUP + +# Unit test for appctl coverage/show command +AT_SETUP([ofproto-dpif - coverage/show]) +OVS_VSWITCHD_START + +ovs-appctl time/stop +# before the first 5 seconds, nothing can be calculated but the total count. +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'], [], [dnl +unixctl_received 0.0/sec 0/min 0/hr total: 2 +]) + +ovs-appctl time/warp 5000 +# at first 5 second time instant, should have stats. +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'], [], [dnl +unixctl_received 0.6/sec 3/min 3/hr total: 4 +]) + +for i in `seq 0 10`; do ovs-appctl time/warp 5000; done +# advance to first 1 minute time instant. +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'], [], [dnl +unixctl_received 0.2/sec 15/min 15/hr total: 16 +]) + +ovs-appctl time/warp 60000 +# advance to next 1 minute time instant directly, should observe the per-minute +# rate drop. +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'], [], [dnl +unixctl_received 0.4/sec 2/min 17/hr total: 18 +]) + +OVS_VSWITCHD_STOP +AT_CLEANUP diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c index bc45dac..a9db1fd 100644 --- a/vswitchd/ovs-vswitchd.c +++ b/vswitchd/ovs-vswitchd.c @@ -28,6 +28,7 @@ #include "bridge.h" #include "command-line.h" #include "compiler.h" +#include "coverage.h" #include "daemon.h" #include "dirs.h" #include "dpif.h" @@ -119,6 +120,7 @@ main(int argc, char *argv[]) bridge_run_fast(); unixctl_server_run(unixctl); netdev_run(); + coverage_run(); signal_wait(sighup); memory_wait(); -- 1.7.10.4 _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev