On 22 Dec 2022, at 18:42, Adrian Moreno wrote:

> Hi Eelco,
>
> I am not done reviewing this patch. I'm still trying to figure out why I
> can't make the pahole thing work, so I'll continue tomorrow. For now, see
> some questions below.

Are you sure you are on the next branch, not on main? That was my issue earlier
:)

> On 12/22/22 09:55, Eelco Chaudron wrote:
>> This patch adds a Python script that can be used to analyze the
>> revalidator runs by providing statistics (including some real time
>> graphs).
>>
>> The USDT events can also be captured to a file and used for
>> later offline analysis.
>>
>> The following blog explains the Open vSwitch revalidator
>> implementation and how this tool can help you understand what is
>> happening in your system.
>>
>> https://developers.redhat.com/articles/2022/10/19/open-vswitch-revalidator-process-explained
>>
>> Signed-off-by: Eelco Chaudron <[email protected]>
>> ---
>>
>> v2: Added note that script only works with a single datapath configured.
>> v3: Updated patch to use pahole to get OVS structures dynamically from
>>      debug data.
>> v4: Added script to usdt_SCRIPTS
>>
>>   Documentation/topics/usdt-probes.rst    |   84 +++
>>   ofproto/ofproto-dpif-upcall.c           |   11
>>   utilities/automake.mk                   |    4
>>   utilities/usdt-scripts/reval_monitor.py |  858 +++++++++++++++++++++++++++++++
>>   4 files changed, 956 insertions(+), 1 deletion(-)
>>   create mode 100755 utilities/usdt-scripts/reval_monitor.py
>>
>> diff --git a/Documentation/topics/usdt-probes.rst b/Documentation/topics/usdt-probes.rst
>> index 7ce19aaed..bc250e723 100644
>> --- a/Documentation/topics/usdt-probes.rst
>> +++ b/Documentation/topics/usdt-probes.rst
>> @@ -214,6 +214,10 @@ Available probes in ``ovs_vswitchd``:
>>   - dpif_recv:recv_upcall
>>   - main:poll_block
>>   - main:run_start
>> +- revalidate_ukey\_\_:entry
>> +- revalidate_ukey\_\_:exit
>> +- udpif_revalidator:start_dump
>> +- udpif_revalidator:sweep_done
>>
>>
>>  dpif_netlink_operate\_\_:op_flow_del
>> @@ -327,6 +331,7 @@ probe main:run_start
>>   ~~~~~~~~~~~~~~~~~~~~
>>    **Description**:
>> +
>>   The ovs-vswitchd's main process contains a loop that runs every time some work
>>   needs to be done. This probe gets triggered every time the loop starts from the
>>   beginning. See also the ``main:poll_block`` probe below.
>> @@ -344,6 +349,7 @@ probe main:poll_block
>>   ~~~~~~~~~~~~~~~~~~~~~
>>    **Description**:
>> +
>>   The ovs-vswitchd's main process contains a loop that runs every time some work
>>   needs to be done. This probe gets triggered every time the loop is done, and
>>   it's about to wait for being re-started by a poll_block() call returning.
>> @@ -358,6 +364,84 @@ See also the ``main:run_start`` probe above.
>>   - ``utilities/usdt-scripts/bridge_loop.bt``
>>
>> +revalidate_ukey\_\_:entry
>> +~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +**Description**:
>> +
>> +This probe gets triggered on entry of the revalidate_ukey__() function.
>> +
>
> Why not trace the beginning and end of revalidate_ukey so we can also detect
> when a flow is evicted because should_revalidate() returns false?

Because this trace event is meant to capture the actual revalidation of a ukey,
not a potential one. If we need that, we can add a more specific probe, but I
think Kevin’s probe will cover it.
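
As an aside, attaching to these probes from BCC is straightforward. Below is a
minimal sketch that simply counts how often revalidate_ukey__:entry fires (the
PID and the ten-second window are made up for the example; it assumes an
ovs-vswitchd built with --enable-usdt-probes):

    from bcc import BPF, USDT
    import time

    PID = 1234  # hypothetical ovs-vswitchd PID

    # Tiny eBPF program that bumps a counter each time the probe fires.
    prog = """
    BPF_ARRAY(hits, u64, 1);

    int probe__reval_entry(struct pt_regs *ctx) {
        int key = 0;
        u64 *val = hits.lookup(&key);
        if (val)
            __sync_fetch_and_add(val, 1);
        return 0;
    }
    """

    u = USDT(pid=PID)
    u.enable_probe(probe="revalidate_ukey__:entry",
                   fn_name="probe__reval_entry")
    b = BPF(text=prog, usdt_contexts=[u])

    time.sleep(10)  # trace for ten seconds
    for _, v in b["hits"].items():
        print("revalidate_ukey__ entries:", v.value)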


> Also, what are your plans wrt Kevin's patch [1]. Is this one supposed to
> supersede it? It introduces an enum to indicate the reason why a flow gets
> evicted, which I think is a good idea. Should we modify the USDT probe
> accordingly when the reason is added?

I have not heard from Kevin since my comments on v6; I hope he will send a v7
soon, based on the feedback and the pahole solution now being available.

Kevin’s patch relates to individual ukey revalidation, while this patch is about
the overall revalidator characteristics. So I see no need to merge them into
one; however, it would be nice for Kevin to send a new revision.

>
> [1] 
> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/
>
>> +**Arguments**:
>> +
>> +- *arg0*: ``(struct udpif *) udpif``
>> +- *arg1*: ``(struct udpif_key *) ukey``
>> +- *arg2*: ``(uint16_t) tcp_flags``
>> +- *arg3*: ``(struct ofpbuf *) odp_actions``
>> +- *arg4*: ``(struct recirc_refs *) recircs``
>> +- *arg5*: ``(struct xlate_cache *) xcache``
>> +
>> +**Script references**:
>> +
>> +- ``utilities/usdt-scripts/reval_monitor.py``
>> +
>> +
>> +revalidate_ukey\_\_:exit
>> +~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +**Description**:
>> +
>> +This probe gets triggered right before the revalidate_ukey__() function exits.
>> +
>> +**Arguments**:
>> +
>> +- *arg0*: ``(struct udpif *) udpif``
>> +- *arg1*: ``(struct udpif_key *) ukey``
>> +- *arg2*: ``(enum reval_result) result``
>> +
>> +**Script references**:
>> +
>> +*None*
>> +
>> +
>> +udpif_revalidator:start_dump
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +**Description**:
>> +
>> +The ovs-vswitchd's revalidator process contains a loop that runs every time
>> +revalidation work is needed. This probe gets triggered every time the
>> +dump phase has started.
>> +
>> +**Arguments**:
>> +
>> +- *arg0*: ``(struct udpif *) udpif``
>> +- *arg1*: ``(size_t) n_flows``
>> +
>> +**Script references**:
>> +
>> +- ``utilities/usdt-scripts/reval_monitor.py``
>> +
>> +
>> +udpif_revalidator:sweep_done
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +**Description**:
>> +
>> +The ovs-vswitchd's revalidator process contains a loop that runs every time
>> +revalidation work is needed. This probe gets triggered every time the
>> +sweep phase has completed.
>> +
>> +**Arguments**:
>> +
>> +- *arg0*: ``(struct udpif *) udpif``
>> +- *arg1*: ``(size_t) n_flows``
>> +- *arg2*: ``(unsigned) MIN(ofproto_max_idle, ofproto_max_revalidator)``
>> +
>> +**Script references**:
>> +
>> +- ``utilities/usdt-scripts/reval_monitor.py``
>> +
>> +
>>   Adding your own probes
>>   ----------------------
>>  diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>> index ad9635496..d078ccc87 100644
>> --- a/ofproto/ofproto-dpif-upcall.c
>> +++ b/ofproto/ofproto-dpif-upcall.c
>> @@ -42,6 +42,7 @@
>>   #include "seq.h"
>>   #include "tunnel.h"
>>   #include "unixctl.h"
>> +#include "openvswitch/usdt-probes.h"
>>   #include "openvswitch/vlog.h"
>>   #include "lib/netdev-provider.h"
>>  @@ -978,6 +979,7 @@ udpif_revalidator(void *arg)
>>                   terse_dump = udpif_use_ufid(udpif);
>>                   udpif->dump = dpif_flow_dump_create(udpif->dpif, terse_dump,
>>                                                       NULL);
>> +                OVS_USDT_PROBE(udpif_revalidator, start_dump, udpif, n_flows);
>>               }
>>           }
>>  @@ -1029,6 +1031,9 @@ udpif_revalidator(void *arg)
>>                             duration);
>>               }
>>  +            OVS_USDT_PROBE(udpif_revalidator, sweep_done, udpif, n_flows,
>> +                           MIN(ofproto_max_idle, ofproto_max_revalidator));
>> +
>>               poll_timer_wait_until(start_time + MIN(ofproto_max_idle,
>>                                                      ofproto_max_revalidator));
>>               seq_wait(udpif->reval_seq, last_reval_seq);
>> @@ -2235,6 +2240,9 @@ revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
>>           .wc = &wc,
>>       };
>>  +    OVS_USDT_PROBE(revalidate_ukey__, entry, udpif, ukey, tcp_flags,
>> +                   odp_actions, recircs, xcache);
>> +
>>       result = UKEY_DELETE;
>>       xoutp = NULL;
>>       netflow = NULL;
>> @@ -2298,6 +2306,9 @@ exit:
>>           netflow_flow_clear(netflow, &ctx.flow);
>>       }
>>       xlate_out_uninit(xoutp);
>> +
>> +    OVS_USDT_PROBE(revalidate_ukey__, exit, udpif, ukey, result);
>> +
>>       return result;
>>   }
>>  diff --git a/utilities/automake.mk b/utilities/automake.mk
>> index 132a16942..9c22ee94d 100644
>> --- a/utilities/automake.mk
>> +++ b/utilities/automake.mk
>> @@ -23,6 +23,7 @@ scripts_DATA += utilities/ovs-lib
>>   usdt_SCRIPTS += \
>>      utilities/usdt-scripts/bridge_loop.bt \
>>      utilities/usdt-scripts/upcall_cost.py \
>> +    utilities/usdt-scripts/reval_monitor.py \
>>      utilities/usdt-scripts/upcall_monitor.py
>>    completion_SCRIPTS += \
>> @@ -68,7 +69,8 @@ EXTRA_DIST += \
>>      utilities/docker/debian/build-kernel-modules.sh \
>>      utilities/usdt-scripts/bridge_loop.bt \
>>      utilities/usdt-scripts/upcall_cost.py \
>> -    utilities/usdt-scripts/upcall_monitor.py
>> +    utilities/usdt-scripts/upcall_monitor.py \
>> +    utilities/usdt-scripts/reval_monitor.py
>>   MAN_ROOTS += \
>>      utilities/ovs-testcontroller.8.in \
>>      utilities/ovs-dpctl.8.in \
>> diff --git a/utilities/usdt-scripts/reval_monitor.py b/utilities/usdt-scripts/reval_monitor.py
>> new file mode 100755
>> index 000000000..148f87320
>> --- /dev/null
>> +++ b/utilities/usdt-scripts/reval_monitor.py
>> @@ -0,0 +1,858 @@
>> +#!/usr/bin/env python3
>> +#
>> +# Copyright (c) 2021 Red Hat, Inc.
>> +#
>> +# Licensed under the Apache License, Version 2.0 (the "License");
>> +# you may not use this file except in compliance with the License.
>> +# You may obtain a copy of the License at:
>> +#
>> +#     http://www.apache.org/licenses/LICENSE-2.0
>> +#
>> +# Unless required by applicable law or agreed to in writing, software
>> +# distributed under the License is distributed on an "AS IS" BASIS,
>> +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> +# See the License for the specific language governing permissions and
>> +# limitations under the License.
>> +#
>> +# Script information:
>> +# -------------------
>> +# reval_monitor.py uses various user-defined tracepoints to get all the
>> +# revalidator-process related variables and will display them in a (dynamic)
>> +# graph. In addition, it will also dump the data to the console
>> +# in a CSV format. Note that all the graphical output can be disabled.
>> +#
>> +# All the USDT events can be saved to a file and then used to replay the
>> +# trace offline and still get the plots.
>> +#
>> +# The script can simply be invoked without any options, and it will try
>> +# to find the running ovs-vswitchd instance:
>> +#
>> +#   # ./reval_monitor.py
>> +#   # Starting trace @2022-09-20T04:07:43.588749 (08:07:43 UTC)
>> +#   ts_start, ts_complete, n_flows, n_reval_flows, avg_n_flows, max_n_flows,
>> +#     flow_limit, dump_duration, poll_wait, actual_wait
>> +#   1741367714251645, 1741367714532545, 0, 0, 0, 10000, 69000, 1, 500, 500.52
>> +#   1741368215056961, 1741368215318223, 0, 0, 0, 10000, 69000, 1, 500, 500.55
>> +#   1741368715865871, 1741368716107089, 0, 0, 0, 10000, 69000, 1, 500, 499.48
>> +#   ^C# Stopping trace @2022-09-20T04:07:49.893827 (08:07:49 UTC)
>> +#
>> +#  IMPORTANT NOTE: This script only works when a single datapath is configured!
>> +#
>> +#  The following are the available options:
>> +#
>> +#    usage: reval_monitor.py [-h] [-c] [--buffer-page-count NUMBER]
>> +#                            [-D [DEBUG]] [-g] [--no-ukey-count]
>> +#                            [-p VSWITCHD_PID] [-P PAHOLE] [-r FILE] [-R]
>> +#                            [-u SECONDS] [-w FILE] [-W FILE]
>> +#
>> +#    options:
>> +#      -h, --help            show this help message and exit
>> +#      -c, --compress-output
>> +#                            Compress output, i.e. only dump changes in
>> +#                            the dataset
>> +#      --buffer-page-count NUMBER
>> +#                            Number of BPF ring buffer pages, default
>> +#                            1024
>> +#      -D [DEBUG], --debug [DEBUG]
>> +#                            Enable eBPF debugging
>> +#      -g, --no-gui          Do use the gui to display plots
>> +#      --no-ukey-count       No revalidate_ukey() counting
>> +#      -p VSWITCHD_PID, --pid VSWITCHD_PID
>> +#                            ovs-vswitchd's PID
>> +#      -P PAHOLE, --pahole PAHOLE
>> +#                            Pahole executable to use, default pahole
>> +#      -r FILE, --read-events FILE
>> +#                            Read events from <FILE> instead of
>> +#                            installing tracepoints
>> +#      -R, --no-realtime-plots
>> +#                            Do not show realtime plot while tracing
>> +#      -u SECONDS, --update-interval SECONDS
>> +#                            Seconds to wait between real time update,
>> +#                            default 1
>> +#      -w FILE, --write-events FILE
>> +#                            Write events to <FILE>
>> +#      -W FILE, --write-charts FILE
>> +#                            Write overall charts to <FILE>.png
>> +
>> +#
>> +# The -g option disables all GUI output of matplotlib; -R only disables the
>> +# real-time plots. As real-time plots are rather slow, the -u option can be
>> +# used to only update the graph every x seconds, which might speed up the
>> +# processing.
>> +#
>> +# The --no-ukey-count option disables counting of the number of flows actually
>> +# being revalidated against the current OpenFlow ruleset. This will not install
>> +# the specific tracepoint which would be called for each flow being
>> +# revalidated.
>> +#
>> +# What is plotted in the graphs (and dumped in the CSV output)?
>> +# - n_flows:       Number of flows active in the system.
>> +# - n_reval_flows: Number of flows that were revalidated against the OpenFlow
>> +#                  ruleset.
>> +# - dump_duration: Time it took to dump and process all flows.
>> +# - avg_n_flows:   Average number of flows in the system.
>> +# - max_n_flows:   Maximum number of flows in the system.
>> +# - flow_limit:    Dynamic flow limit.
>> +# - poll_wait:     Time requested for the poll wait.
>> +# - actual_wait:   Time it took to be woken up.
>> +#
>> +# Dependencies:
>> +#  This script needs the 'readelf' binary to be available. In addition, it
>> +#  also needs pahole to be installed, and it needs a version that is equal
>> +#  to or newer than the following commit on the next branch:
>> +#
>> +#    https://git.kernel.org/pub/scm/devel/pahole/pahole.git/?h=next
>> +#      c55b13b9d785 ("WIP: Remove DW_TAG_atomic_type when encoding BTF")
>> +#
>
> Should we give the users a default set of headers they can use if they cannot 
> install such a version of pahole?

I guess this is what we wanted to avoid, i.e. having static headers defined
anywhere. In addition, I do not think we should generate this at compile time,
as that would add pahole as a build dependency.
I would assume that people who need this script to dig into technical details
should be able to figure out how to get a suitable pahole on a development
setup.
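
For reference, the pahole invocation the script ends up making is roughly the
following (a sketch of what get_ovs_definitions() does under the hood; the PID
is hypothetical). If this works on your setup, the script should too:

    import subprocess

    def dump_ovs_struct(obj="udpif", pahole="pahole", pid="self"):
        # Ask pahole to emit compilable C for the given structure straight
        # from the running ovs-vswitchd binary (or its debug data).
        binary = "/proc/{}/exe".format(pid)
        result = subprocess.run([pahole, "-C", obj, "--compile", binary],
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT,
                                encoding="utf8")
        return result.returncode, result.stdout

    rc, text = dump_ovs_struct(pid=1234)  # hypothetical ovs-vswitchd PID
    if rc == 0:
        print(text)  # C definitions ready to drop into the eBPF source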

>> +#  To use a locally compiled pahole the --pahole option can be used.
>> +#  For example:
>> +#    # ./reval_monitor.py --pahole ~/pahole/build/pahole -g
>> +#    Starting trace @2022-12-20T14:57:26.077815 (13:57:26 UTC)
>> +#    ts_start, ts_complete, n_flows, n_reval_flows, avg_n_flows, max_n_flows, \
>> +#      flow_limit, dump_duration, poll_wait, actual_wait
>> +#    4202771850983494, 4202771851472838, 0, 0, 0, 0, 10000, 1, 500, 15.06
>> +#    4202771866531996, 4202771867713366, 0, 0, 0, 0, 10000, 1, 500, 4.23
>> +#    4202771871941979, 4202771872749915, 0, 0, 0, 0, 10000, 1, 500, 500.02
>> +#    4202772372770361, 4202772373531820, 0, 0, 0, 0, 10000, 1, 500, 499.96
>> +#    4202772873487942, 4202772874514753, 0, 0, 0, 0, 10000, 1, 500, 500.01
>> +#    4202773374528435, 4202773375695054, 0, 0, 0, 0, 10000, 1, 500, 500.01
>> +#    4202773875701559, 4202773876880763, 0, 0, 0, 0, 10000, 1, 500, 500.04
>> +#    4202774376925058, 4202774377905799, 0, 0, 0, 0, 10000, 1, 500, 500.03
>> +#    ^C# Stopping trace @2022-12-20T14:57:40.391730 (13:57:40 UTC)
>> +#
>> +
>> +try:
>> +    from bcc import BPF, USDT, USDTException
>> +except ModuleNotFoundError:
>> +    print("WARNING: Can't find the BPF Compiler Collection (BCC) tools!")
>> +    print("         This is NOT problem if you analyzing previously 
>> collected"
>> +          " data.\n")
>> +
>> +from collections import namedtuple
>> +
>> +import argparse
>> +import ast
>> +import datetime
>> +import re
>> +import subprocess
>> +import sys
>> +import time
>> +
>> +import pytz
>> +import psutil
>> +import matplotlib.pyplot as plt
>> +
>> +#
>> +# Actual eBPF source code
>> +#
>> +EBPF_SOURCE = """
>> +#include <linux/sched.h>
>> +
>> +<OVS_INCLUDE_DEFINITIONS>
>> +
>> +struct event_t {
>> +    u64  ts;
>> +    u32  pid;
>> +    u32  id;
>> +    u64  n_flows;
>> +    u32  avg_n_flows;
>> +    u32  max_n_flows;
>> +    u32  flow_limit;
>> +    u32  dump_duration;
>> +    u32  poll_wait;
>> +};
>> +
>> +
>> +BPF_RINGBUF_OUTPUT(events, <BUFFER_PAGE_CNT>);
>> +BPF_TABLE("percpu_array", uint32_t, uint64_t, dropcnt, 1);
>> +
>> +static struct event_t *get_event(uint32_t id) {
>> +    struct event_t *event = events.ringbuf_reserve(sizeof(struct event_t));
>> +
>> +    if (!event) {
>> +        uint32_t type = 0;
>> +        uint64_t *value = dropcnt.lookup(&type);
>> +        if (value)
>> +            __sync_fetch_and_add(value, 1);
>> +
>> +        return NULL;
>> +    }
>> +
>> +    event->id = id;
>> +    event->ts = bpf_ktime_get_ns();
>> +    event->pid = bpf_get_current_pid_tgid();
>> +
>> +    return event;
>> +}
>> +
>> +int probe__start_dump(struct pt_regs *ctx) {
>> +    struct event_t *event = get_event(0);
>> +    if (!event)
>> +        return 1;
>> +
>> +    events.ringbuf_submit(event, 0);
>> +    return 0;
>> +};
>> +
>> +int probe__sweep_done(struct pt_regs *ctx) {
>> +    struct udpif udpif;
>> +
>> +    bpf_usdt_readarg_p(1, ctx, &udpif, sizeof(udpif));
>> +
>> +    struct event_t *event = get_event(1);
>> +    if (!event)
>> +        return 1;
>> +
>> +    event->avg_n_flows = udpif.avg_n_flows;
>> +    event->max_n_flows = udpif.max_n_flows;
>> +    event->flow_limit = udpif.flow_limit;
>> +    event->dump_duration = udpif.dump_duration;
>> +
>> +    bpf_usdt_readarg(2, ctx, &event->n_flows);
>> +    bpf_usdt_readarg(3, ctx, &event->poll_wait);
>> +
>> +    events.ringbuf_submit(event, 0);
>> +    return 0;
>> +};
>> +
>> +int probe__reval_entry(struct pt_regs *ctx) {
>> +    struct event_t *event = get_event(2);
>> +    if (!event)
>> +        return 1;
>> +
>> +    events.ringbuf_submit(event, 0);
>> +    return 0;
>> +};
>> +"""
>> +
>> +
>> +#
>> +# event_to_dict()
>> +#
>> +def event_to_dict(event):
>> +    event_dict = {}
>> +
>> +    for field, _ in event._fields_:
>> +        if isinstance(getattr(event, field), (int, bytes)):
>> +            event_dict[field] = getattr(event, field)
>> +
>> +    return event_dict
>> +
>> +
>> +#
>> +# print_csv_header()
>> +#
>> +def print_csv_header():
>> +    print("ts_start, ts_complete, n_flows, n_reval_flows, avg_n_flows, "
>> +          "max_n_flows, flow_limit, dump_duration, poll_wait, actual_wait")
>> +
>> +
>> +#
>> +# process_event()
>> +#
>> +def process_event(ctx, data, size):
>> +    event = b['events'].event(data)
>> +    _process_event(event)
>> +
>> +
>> +def _process_event(event):
>> +    global graph
>> +
>> +    if export_file is not None:
>> +        export_file.write("event = {}\n".format(event_to_dict(event)))
>> +
>> +    if event.id == 0 and not state['running']:
>> +        start = state["last_start"]
>> +        done = state["last_done"]
>> +        if done is not None and start is not None:
>> +            actual_wait = (event.ts - done.ts) / 1000000
>> +            csv = "{}, {}, {}, {}, {}, {}, {}, {}, {}, {:.2f}".format(
>> +                start.ts, done.ts, done.n_flows, graph.ukey_count,
>> +                done.avg_n_flows, done.max_n_flows, done.flow_limit,
>> +                done.dump_duration, done.poll_wait, actual_wait)
>> +
>> +            if graph.base_time == 0:
>> +                graph = graph._replace(base_time=done.ts)
>> +
>> +            graph.time.append((done.ts - graph.base_time) / 1000000000)
>> +            graph.n_flows.append(done.n_flows)
>> +            graph.n_reval_flows.append(graph.ukey_count)
>> +            graph.avg_n_flows.append(done.avg_n_flows)
>> +            graph.max_n_flows.append(done.max_n_flows)
>> +            graph.flow_limit.append(done.flow_limit)
>> +            graph.dump_duration.append(done.dump_duration)
>> +            graph.poll_wait.append(done.poll_wait)
>> +            graph.actual_wait.append(actual_wait)
>> +
>> +            if not options.no_gui and not options.no_realtime_plots:
>> +                updated_graph = dynamic_plot_update(
>> +                    graph, refresh=options.update_interval)
>> +                if updated_graph is None:
>> +                    raise KeyboardInterrupt
>> +                graph = updated_graph
>> +
>> +            if options.compress_output:
>> +                last_csv = state["last_csv"]
>> +                if last_csv is None or \
>> +                   csv.split(",")[2:-1] != last_csv.split(",")[2:-1] or \
>> +                   abs((event.ts - done.ts) / 1000000 - done.poll_wait) > 100:
>> +                    print(csv)
>> +                else:
>> +                    state["last_not_printed_csv"] = csv
>> +
>> +                state["last_csv"] = csv
>> +            else:
>> +                print(csv)
>> +
>> +        state["last_start"] = event
>> +        state['running'] = True
>> +        graph = graph._replace(ukey_count=0)
>> +    elif event.id == 1 and state['running']:
>> +        state["last_done"] = event
>> +        state['running'] = False
>> +    elif event.id == 2 and state['running']:
>> +        graph = graph._replace(ukey_count=graph.ukey_count + 1)
>> +
>> +
>> +#
>> +# run_program()
>> +#
>> +def run_program(command, need_result=True):
>> +    try:
>> +        process = subprocess.run(command,
>> +                                 stdout=subprocess.PIPE,
>> +                                 stderr=subprocess.STDOUT,
>> +                                 encoding='utf8',
>> +                                 check=True)
>> +
>> +    except subprocess.CalledProcessError as perror:
>> +        if need_result:
>> +            return perror.returncode, perror.stdout
>> +
>> +        return perror.returncode
>> +
>> +    if need_result:
>> +        return 0, process.stdout
>> +
>> +    return 0
>> +
>> +
>> +#
>> +# get_ovs_definitions()
>> +#
>> +def get_ovs_definitions(objects, pahole="pahole", pid="self"):
>> +    if not isinstance(objects, list):
>> +        objects = [objects]
>> +
>> +    if len(objects) == 0:
>> +        raise ValueError("Must supply at least one object!")
>> +
>> +    vswitchd = "/proc/{}/exe".format(str(pid))
>> +
>> +    object_str = ""
>> +    for obj in objects:
>> +        object_str += obj + ','
>> +
>> +    object_str = object_str.rstrip(',')
>> +
>> +    def run_pahole(debug_file):
>> +        error, result = run_program([pahole, "-C", object_str, "--compile",
>> +                                     debug_file])
>> +
>> +        if error:
>> +            if "pahole: " + debug_file + ": Invalid argument" not in result:
>> +                print("ERROR: Pahole failed to get ovs-vswitchd data "
>> +                      "structures!\n{}".format(re.sub('^', ' ' * 7,
>> +                                                      result.rstrip(),
>> +                                                      flags=re.MULTILINE)))
>> +                sys.exit(-1)
>> +
>> +            return None
>> +
>> +        if bool(re.search("pahole: type .* not found", result)):
>> +            return None
>> +
>> +        return result
>> +
>> +    def run_readelf(debug_file):
>> +        error, result = run_program(['readelf', "-n",
>> +                                     "--debug-dump=links", debug_file])
>> +
>> +        if error:
>> +            print("ERROR: Failed 'readelf' on \"{}\"!\n{}".
>> +                  format(debug_file, re.sub('^', ' ' * 7, result,
>> +                                            flags=re.MULTILINE)))
>> +            sys.exit(-1)
>> +
>> +        return result
>> +
>> +    #
>> +    # First try to find the debug data as part of the executable
>> +    #
>> +    result = run_pahole(vswitchd)
>> +
>> +    if result is None:
>> +        print("INFO: Failed to find debug info in \"{}\"!".format(vswitchd))
>> +
>> +        #
>> +        # Get additional debug and dwz information if available
>> +        #
>> +
>> +        elf_result = run_readelf(vswitchd)
>> +        match = re.search("Build ID: ([0-9a-fA-F]+)", elf_result)
>> +        if match is None:
>> +            print("ERROR: Can't find build ID to read debug symbols!")
>> +            sys.exit(-1)
>> +
>> +        dbg_file = "/usr/lib/debug/.build-id/{}/{}.debug".format(
>> +            match.group(1)[:2], match.group(1)[2:])
>> +
>> +        #
>> +        # First try to find data in default debug image
>> +        #
>> +        result = run_pahole(dbg_file)
>> +
>> +    if result is None:
>> +        print("ERROR: Failed to find needed data structures through 
>> pahole!")
>> +        sys.exit(-1)
>> +
>> +    #
>> +    # We need an empty _Atomic definition to avoid compiler complaints.
>> +    #
>> +    result = "#define _Atomic\n" + result
>> +
>> +    #
>> +    # Remove the uint64_t definition as it conflicts with the kernel one.
>> +    #
>> +    result = re.sub("^typedef.*uint64_t;$", "", result, flags=re.MULTILINE)
>> +
>> +    return result
>> +
>> +
>> +#
>> +# next_power_of_two()
>> +#
>> +def next_power_of_two(val):
>> +    np = 1
>> +    while np < val:
>> +        np *= 2
>> +    return np
>> +
>> +
>> +#
>> +# dynamic_plot_init()
>> +#
>> +def dynamic_plot_init(real_time=True):
>> +
>> +    if real_time:
>> +        lines = []
>> +        fig, axs = plt.subplots(4, figsize=(19, 10))
>> +        fig.suptitle('Revalidator Handling')
>> +        for ax in axs:
>> +            ax.grid()
>> +
>> +        axs[0].set_ylabel("Numer of flows", weight='bold')
>> +        axs[1].set_ylabel("Time spend (ms)", weight='bold')
>> +        axs[2].set_ylabel("Numer of flows", weight='bold')
>> +        axs[3].set_ylabel("Time spend (ms)", weight='bold')
>> +        axs[3].set_xlabel("Time (seconds since start)", weight='bold')
>> +
>> +        lines.append(axs[0].plot([], [], label="n_flows", marker='o')[0])
>> +        lines.append(axs[0].plot([], [], label="n_reval_flows")[0])
>> +        axs[0].legend(bbox_to_anchor=(1, 1), loc='upper left',
>> +                      borderaxespad=0.5)
>> +        axs[0].set_xlim(0, 30)
>> +        axs[0].set_ylim(-4, 104)
>> +
>> +        lines.append(axs[1].plot([], [], color="orange",
>> +                                 label="dump_duration")[0])
>> +        axs[1].legend(bbox_to_anchor=(1, 1),
>> +                      loc='upper left', borderaxespad=0.5)
>> +        axs[1].set_xlim(0, 30)
>> +        axs[1].set_ylim(-0.4, 10.4)
>> +
>> +        lines.append(axs[2].plot([], [], label="avg_n_flows")[0])
>> +        lines.append(axs[2].plot([], [], label="max_n_flows")[0])
>> +        lines.append(axs[2].plot([], [], label="flow_limit")[0])
>> +        axs[2].legend(bbox_to_anchor=(1, 1), loc='upper left',
>> +                      borderaxespad=0.5)
>> +        axs[2].set_xlim(0, 30)
>> +        axs[2].set_ylim(-600, 15600)
>> +
>> +        lines.append(axs[3].plot([], [], label="poll_wait")[0])
>> +        lines.append(axs[3].plot([], [], label="actual_wait")[0])
>> +        axs[3].legend(bbox_to_anchor=(1, 1), loc='upper left',
>> +                      borderaxespad=0.5)
>> +        axs[3].set_xlim(0, 30)
>> +        axs[3].set_ylim(-20, 520)
>> +
>> +        fig.tight_layout()
>> +
>> +        plt.ion()
>> +        plt.show()
>> +    else:
>> +        fig = None
>> +        axs = None
>> +        lines = None
>> +
>> +    graph_data = {"base_time": 0,
>> +                  "l_index": 0,
>> +                  "fig": fig,
>> +                  "axs": axs,
>> +                  "lines": lines,
>> +                  "last_update": 0,
>> +                  "ukey_count": 0,
>> +                  "time": [],
>> +                  "n_flows": [],
>> +                  "n_reval_flows": [],
>> +                  "avg_n_flows": [],
>> +                  "max_n_flows": [],
>> +                  "flow_limit": [],
>> +                  "dump_duration": [],
>> +                  "poll_wait": [],
>> +                  "actual_wait": []}
>> +
>> +    return namedtuple("GraphData", graph_data.keys())(*graph_data.values())
>> +
>> +
>> +#
>> +# dynamic_plot_update()
>> +#
>> +def dynamic_plot_update(graph_data, refresh=1):
>> +
>> +    if graph_data.last_update != 0 and \
>> +       (graph_data.time[-1] - graph_data.last_update) < refresh:
>> +        return graph_data
>> +
>> +    graph_data = graph_data._replace(last_update=graph_data.time[-1])
>> +
>> +    if (graph_data.time[-1] - graph_data.time[graph_data.l_index]) > 30:
>> +        for i in range(graph_data.l_index + 1, len(graph_data.time)):
>> +            if (graph_data.time[-1] - graph_data.time[i]) <= 30:
>> +                graph_data = graph_data._replace(l_index=i)
>> +                break
>> +
>> +    for line in graph_data.lines:
>> +        line.set_xdata(graph_data.time[graph_data.l_index:])
>> +
>> +    graph_data.lines[0].set_ydata(graph_data.n_flows[graph_data.l_index:])
>> +    graph_data.lines[1].set_ydata(
>> +        graph_data.n_reval_flows[graph_data.l_index:])
>> +    graph_data.lines[2].set_ydata(
>> +        graph_data.dump_duration[graph_data.l_index:])
>> +    graph_data.lines[3].set_ydata(graph_data.avg_n_flows[graph_data.l_index:])
>> +    graph_data.lines[4].set_ydata(graph_data.max_n_flows[graph_data.l_index:])
>> +    graph_data.lines[5].set_ydata(graph_data.flow_limit[graph_data.l_index:])
>> +    graph_data.lines[6].set_ydata(graph_data.poll_wait[graph_data.l_index:])
>> +    graph_data.lines[7].set_ydata(graph_data.actual_wait[graph_data.l_index:])
>> +
>> +    for ax in graph_data.axs:
>> +        if graph_data.l_index == 0:
>> +            ax.autoscale(enable=True, axis='y')
>> +        else:
>> +            ax.autoscale(enable=True)
>> +
>> +        ax.relim(visible_only=True)
>> +        ax.autoscale_view(tight=True, scalex=True, scaley=True)
>> +
>> +    try:
>> +        graph_data.fig.canvas.draw()
>> +        graph_data.fig.canvas.flush_events()
>> +    except KeyboardInterrupt:
>> +        return None
>> +
>> +    return graph_data
>> +
>> +
>> +#
>> +# show_graph()
>> +#
>> +def show_graph(graph_data, gui=False, file_name=None):
>> +
>> +    if len(graph_data.time) == 0 or (not gui and file_name is None):
>> +        return
>> +
>> +    plt.ioff()
>> +
>> +    fig, (nf_ax, dd_ax, f_ax, t_ax) = plt.subplots(4, figsize=(19, 10))
>> +    fig.suptitle('Revalidator Handling')
>> +    nf_ax.grid()
>> +    f_ax.grid()
>> +    dd_ax.grid()
>> +    t_ax.grid()
>> +
>> +    nf_ax.set_ylabel("Numer of flows", weight='bold')
>> +    f_ax.set_ylabel("Numer of flows", weight='bold')
>> +    dd_ax.set_ylabel("Time spend (ms)", weight='bold')
>> +    t_ax.set_ylabel("Time spend (ms)", weight='bold')
>> +    t_ax.set_xlabel("Time (seconds since start)", weight='bold')
>> +
>> +    nf_ax.plot(graph_data.time, graph_data.n_flows, label="n_flows")
>> +    nf_ax.plot(graph_data.time, graph_data.n_reval_flows,
>> +               label="n_reval_flows")
>> +    nf_ax.legend(bbox_to_anchor=(1, 1), loc='upper left', borderaxespad=0.5)
>> +
>> +    dd_ax.plot(graph_data.time, graph_data.dump_duration, color="orange",
>> +               label="dump_duration")
>> +    dd_ax.legend(bbox_to_anchor=(1, 1), loc='upper left', borderaxespad=0.5)
>> +
>> +    f_ax.plot(graph_data.time, graph_data.avg_n_flows, label="avg_n_flows")
>> +    f_ax.plot(graph_data.time, graph_data.max_n_flows, label="max_n_flows")
>> +    f_ax.plot(graph_data.time, graph_data.flow_limit, label="flow_limit")
>> +    f_ax.legend(bbox_to_anchor=(1, 1), loc='upper left', borderaxespad=0.5)
>> +
>> +    t_ax.plot(graph_data.time, graph_data.poll_wait, label="poll_wait")
>> +    t_ax.plot(graph_data.time, graph_data.actual_wait, label="actual_wait")
>> +    t_ax.legend(bbox_to_anchor=(1, 1), loc='upper left', borderaxespad=0.5)
>> +
>> +    fig.tight_layout()
>> +
>> +    if file_name is not None and file_name != "":
>> +        fig.savefig(file_name + '.png')
>> +
>> +    if gui:
>> +        try:
>> +            plt.show()
>> +        except KeyboardInterrupt:
>> +            pass
>> +
>> +    plt.close(fig)
>> +
>> +
>> +#
>> +# process_events_from_file()
>> +#
>> +def process_events_from_file(file_name):
>> +    try:
>> +        with open(file_name, 'r') as fd:
>> +            print("- Reading events from \"{}\"...".format(file_name))
>> +
>> +            print_csv_header()
>> +            for entry in fd:
>> +                entry = entry.rstrip()
>> +                if entry.startswith('event = {'):
>> +                    event = ast.literal_eval(entry[8:])
>> +                    event = namedtuple("EventObject",
>> +                                       event.keys())(*event.values())
>> +                    try:
>> +                        _process_event(event)
>> +                    except KeyboardInterrupt:
>> +                        break
>> +
>> +    except (FileNotFoundError, PermissionError):
>> +        print("ERROR: Can't open file \"{}\" for 
>> reading!".format(file_name))
>> +        sys.exit(-1)
>> +
>> +    show_graph(graph, gui=not options.no_gui, file_name=options.write_charts)
>> +
>> +
>> +#
>> +# main()
>> +#
>> +def main():
>> +    #
>> +    # Don't like these globals, but ctx passing does not seem to work with the
>> +    # existing open_ring_buffer() API :(
>> +    #
>> +    global b
>> +    global export_file
>> +    global options
>> +    global state
>> +    global graph
>> +
>> +    #
>> +    # Argument parsing
>> +    #
>> +    parser = argparse.ArgumentParser()
>> +
>> +    parser.add_argument("-c", "--compress-output", action="store_true",
>> +                        help="Compress output, i.e. only dump changes in "
>> +                        "the dataset")
>> +    parser.add_argument("--buffer-page-count",
>> +                        help="Number of BPF ring buffer pages, default 1024",
>> +                        type=int, default=1024, metavar="NUMBER")
>> +    parser.add_argument("-D", "--debug",
>> +                        help="Enable eBPF debugging",
>> +                        type=int, const=0x3f, default=0, nargs='?')
>> +    parser.add_argument("-g", "--no-gui", action="store_true",
>> +                        help="Do use the gui to display plots")
>
> s/Do use/Do not use/

Thanks, I’ll send a quick v5 to speed things up.

>> +    parser.add_argument("--no-ukey-count", action="store_true",
>> +                        help="No revalidate_ukey() counting")
>> +    parser.add_argument("-p", "--pid", metavar="VSWITCHD_PID",
>> +                        help="ovs-vswitchd's PID",
>> +                        type=int, default=None)
>> +    parser.add_argument("-P", "--pahole", metavar="PAHOLE",
>> +                        help="Pahole executable to use, default pahole",
>> +                        type=str, default="pahole")
>> +    parser.add_argument("-r", "--read-events",
>> +                        help="Read events from <FILE> instead of installing 
>> "
>> +                        "tracepoints", type=str, default=None, 
>> metavar="FILE")
>> +    parser.add_argument("-R", "--no-realtime-plots", action="store_true",
>> +                        help="Do not show realtime plot while tracing")
>> +    parser.add_argument("-u", "--update-interval",
>> +                        help="Seconds to wait between real time update, "
>> +                        "default 1", type=float, default=1, metavar="SECONDS")
>> +    parser.add_argument("-w", "--write-events",
>> +                        help="Write events to <FILE>",
>> +                        type=str, default=None, metavar="FILE")
>> +    parser.add_argument("-W", "--write-charts",
>> +                        help="Write overall charts to <FILE>.png",
>> +                        type=str, default=None, metavar="FILE")
>> +
>> +    options = parser.parse_args()
>> +
>> +    #
>> +    # Find the PID of the ovs-vswitchd daemon if not specified.
>> +    #
>> +    if options.pid is None and options.read_events is None:
>> +        for proc in psutil.process_iter():
>> +            if 'ovs-vswitchd' in proc.name():
>> +                if options.pid is not None:
>> +                    print("ERROR: Multiple ovs-vswitchd daemons running, "
>> +                          "use the -p option!")
>> +                    sys.exit(-1)
>> +
>> +                options.pid = proc.pid
>> +
>> +    #
>> +    # Error checking on input parameters
>> +    #
>> +    if options.pid is None and options.read_events is None:
>> +        print("ERROR: Failed to find ovs-vswitchd's PID!")
>> +        sys.exit(-1)
>> +
>> +    if options.read_events is not None and options.write_events is not None:
>> +        print("ERROR: Either supply the read or write events option, "
>> +              "not both!")
>> +        sys.exit(-1)
>> +
>> +    options.buffer_page_count = next_power_of_two(options.buffer_page_count)
>> +
>> +    #
>> +    # Define the state and graph
>> +    #
>> +    state = {"last_start": None,
>> +             "last_done": None,
>> +             "running": False,
>> +             "last_csv": None,
>> +             "last_not_printed_csv": None}
>> +
>> +    export_file = None
>> +
>> +    graph = dynamic_plot_init(real_time=(not options.no_gui
>> +                                         and not options.no_realtime_plots))
>> +
>> +    #
>> +    # Process events from file if required.
>> +    #
>> +    if options.read_events is not None:
>> +        process_events_from_file(options.read_events)
>> +        sys.exit(0)
>> +
>> +    #
>> +    # Open write handle if needed.
>> +    #
>> +    if options.write_events is not None:
>> +        try:
>> +            export_file = open(options.write_events, "w")
>> +        except (FileNotFoundError, IOError, PermissionError) as e:
>> +            print("ERROR: Can't create export file \"{}\": {}".format(
>> +                options.write_events, e.strerror))
>> +            sys.exit(-1)
>> +
>> +    #
>> +    # Attach the usdt probe
>> +    #
>> +    u = USDT(pid=int(options.pid))
>> +    try:
>> +        u.enable_probe(probe="start_dump", fn_name="probe__start_dump")
>> +        u.enable_probe(probe="sweep_done", fn_name="probe__sweep_done")
>> +        if not options.no_ukey_count:
>> +            u.enable_probe(probe="revalidate_ukey__:entry",
>> +                           fn_name="probe__reval_entry")
>> +    except USDTException as e:
>> +        print("ERROR: {}".format(
>> +            (re.sub('^', ' ' * 7, str(e), flags=re.MULTILINE)).strip().
>> +            replace("--with-dtrace or --enable-dtrace",
>> +                    "--enable-usdt-probes")))
>> +        sys.exit(-1)
>> +
>> +    #
>> +    # Attach probe to running process
>> +    #
>> +    source = EBPF_SOURCE.replace("<BUFFER_PAGE_CNT>",
>> +                                 str(options.buffer_page_count))
>> +    source = source.replace("<OVS_INCLUDE_DEFINITIONS>",
>> +                            get_ovs_definitions("udpif", pid=options.pid,
>> +                                                pahole=options.pahole))
>> +
>> +    b = BPF(text=source, usdt_contexts=[u], debug=options.debug)
>> +
>> +    #
>> +    # Print header
>> +    #
>> +    ltz = datetime.datetime.now()
>> +    utc = ltz.astimezone(pytz.utc)
>> +    time_string = "# Starting trace @{} ({} UTC)".format(
>> +        ltz.isoformat(), utc.strftime("%H:%M:%S"))
>> +
>> +    if export_file is not None:
>> +        export_file.write(time_string + "\n")
>> +
>> +    print(time_string)
>> +    print_csv_header()
>> +
>> +    #
>> +    # Process all events
>> +    #
>> +    b['events'].open_ring_buffer(process_event)
>> +    while 1:
>> +        try:
>> +            b.ring_buffer_poll()
>> +            time.sleep(0.5)
>> +        except KeyboardInterrupt:
>> +            break
>> +
>> +    dropcnt = b.get_table("dropcnt")
>> +    for k in dropcnt.keys():
>> +        count = dropcnt.sum(k).value
>> +        if k.value == 0 and count > 0:
>> +            print("\n# WARNING: Not all upcalls were captured, {} were "
>> +                  "dropped!\n#          Increase the BPF ring buffer size "
>> +                  "with the --buffer-page-count option.".format(count))
>> +
>> +    #
>> +    # Display footer.
>> +    #
>> +    if state["last_not_printed_csv"] is not None:
>> +        print(state["last_not_printed_csv"])
>> +
>> +    ltz = datetime.datetime.now()
>> +    utc = ltz.astimezone(pytz.utc)
>> +    time_string = "# Stopping trace @{} ({} UTC)".format(
>> +        ltz.isoformat(), utc.strftime("%H:%M:%S"))
>> +
>> +    if export_file is not None:
>> +        export_file.write(time_string + "\n")
>> +
>> +    print(time_string)
>> +
>> +    #
>> +    # Close event file if used.
>> +    #
>> +    if options.write_events is not None:
>> +        export_file.close()
>> +
>> +    #
>> +    # Do final graph if requested.
>> +    #
>> +    show_graph(graph, gui=not options.no_gui, file_name=options.write_charts)
>> +
>> +
>> +#
>> +# Start main() as the default entry point...
>> +#
>> +if __name__ == '__main__':
>> +    main()
>>
>
> -- 
> Adrián Moreno

