On 10 Jan 2024, at 20:34, Ilya Maximets wrote:

> On 1/10/24 12:25, Eelco Chaudron wrote:
>> Martin Kennelly observes that even though this data is available to
>> humans via the journal/log files, these aren't exactly easy for a
>> developer to make any kind of behavioral inferences.  This kind of
>> log and counter would be useful when checking on system health to
>> let us know that an Open vSwitch component is noticing some kind of
>> system level hiccup.
>>
>> Add a new coverage counter to track information on these events, and
>> let a developer or system engineer know how long these events have
>> occurred with some historical context.
>>
>> Reported-at: https://lists.linuxfoundation.org/pipermail/ovs-discuss/2023-June/052523.html
>> Suggested-by: Martin Kennelly <[email protected]>
>> Co-Authored-By: Aaron Conole <[email protected]>
>> Signed-off-by: Aaron Conole <[email protected]>
>> Signed-off-by: Eelco Chaudron <[email protected]>
>> Acked-by: Simon Horman <[email protected]>
>> ---
>>  v2: Updated commit message, based on similar patch upstream.
>>
>>  lib/timeval.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/lib/timeval.c b/lib/timeval.c
>> index 193c7bab1..0abe7e555 100644
>> --- a/lib/timeval.c
>> +++ b/lib/timeval.c
>> @@ -41,6 +41,8 @@
>>
>>  VLOG_DEFINE_THIS_MODULE(timeval);
>>
>> +COVERAGE_DEFINE(long_poll_interval);
>> +
>>  #if !defined(HAVE_CLOCK_GETTIME)
>>  typedef unsigned int clockid_t;
>>  static int clock_gettime(clock_t id, struct timespec *ts);
>> @@ -644,6 +646,8 @@ log_poll_interval(long long int last_wakeup)
>>          const struct rusage *last_rusage = get_recent_rusage();
>>          struct rusage rusage;
>>
>> +        COVERAGE_INC(long_poll_interval);
>> +
>>          if (!getrusage_thread(&rusage)) {
>>              VLOG_WARN("Unreasonably long %lldms poll interval"
>>                        " (%lldms user, %lldms system)",
>
> Potentially interesting side effect of this change would be that
> coverage counters will likely be logged on every long poll interval,
> since every long poll interval will change them.  Before this change
> many coverage dumps are skipped, because they are the same (see the
> coverage_log() function).  Did you notice such a behavior change?

I did not notice this, as I only forced a single instance, but this is a good
catch. I could exclude this new counter from the coverage_hash() function.
However, the malloc-related counters will probably change between dumps
anyway, so excluding it may not make much difference. What do you think?
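
To make the trade-off concrete, here is a minimal standalone sketch of the
"skip the dump if nothing changed" idea. It is not the actual coverage_log()
or coverage_hash() code: the demo_* names, the in_hash flag, and the FNV-1a
hash are all assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a coverage counter; not the OVS struct. */
struct demo_counter {
    const char *name;
    unsigned long long total;
    bool in_hash;               /* false: ignored when deciding to log. */
};

/* Remembers the hash of the counters at the last dump. */
struct demo_log_state {
    bool valid;                 /* false until the first dump. */
    uint32_t last_hash;
};

/* One FNV-1a step (assumed hash; the real code may use something else). */
static uint32_t
demo_hash_byte(uint32_t h, unsigned char b)
{
    return (h ^ b) * 16777619u;
}

/* Hashes the name and total of every counter that has in_hash set. */
static uint32_t
demo_hash(const struct demo_counter *c, size_t n)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < n; i++) {
        if (!c[i].in_hash) {
            continue;
        }
        for (const char *p = c[i].name; *p; p++) {
            h = demo_hash_byte(h, (unsigned char) *p);
        }
        for (int shift = 0; shift < 64; shift += 8) {
            h = demo_hash_byte(h, (unsigned char) ((c[i].total >> shift) & 0xff));
        }
    }
    return h;
}

/* Returns true if the counters should be dumped, that is, if this is the
 * first dump or the hash differs from the one seen at the last dump. */
static bool
demo_should_log(const struct demo_counter *c, size_t n,
                struct demo_log_state *state)
{
    uint32_t h = demo_hash(c, n);

    if (state->valid && h == state->last_hash) {
        return false;
    }
    state->valid = true;
    state->last_hash = h;
    return true;
}
```

With in_hash set for the new counter, every long poll interval changes the
hash and triggers a dump. With it cleared, a dump only happens when some other
counter changed, which is why the benefit depends on how often the other
counters move.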

//Eelco

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
