On 22 Dec 2022, at 18:42, Adrian Moreno wrote:
> Hi Eelco,
>
> I am not done reviewing this patch. I'm still trying to figure out why I
> can't make the pahole thing work, so I'll continue tomorrow. For now, see
> some questions below.

Are you sure you are on the next branch, not on main? That was my issue earlier :)

> On 12/22/22 09:55, Eelco Chaudron wrote:
>> This patch adds a Python script that can be used to analyze the
>> revalidator runs by providing statistics (including some real-time
>> graphs).
>>
>> The USDT events can also be captured to a file and used for
>> later offline analysis.
>>
>> The following blog explains the Open vSwitch revalidator
>> implementation and how this tool can help you understand what is
>> happening in your system.
>>
>> https://developers.redhat.com/articles/2022/10/19/open-vswitch-revalidator-process-explained
>>
>> Signed-off-by: Eelco Chaudron <[email protected]>
>> ---
>>
>> v2: Added note that the script only works with a single datapath configured.
>> v3: Updated patch to use pahole to get OVS structures dynamically from
>>     debug data.
>> v4: Added script to usdt_SCRIPTS.
>>
>>  Documentation/topics/usdt-probes.rst    |   84 +++
>>  ofproto/ofproto-dpif-upcall.c           |   11
>>  utilities/automake.mk                   |    4
>>  utilities/usdt-scripts/reval_monitor.py |  858 +++++++++++++++++++++++++++++++
>>  4 files changed, 956 insertions(+), 1 deletion(-)
>>  create mode 100755 utilities/usdt-scripts/reval_monitor.py
>>
>> diff --git a/Documentation/topics/usdt-probes.rst b/Documentation/topics/usdt-probes.rst
>> index 7ce19aaed..bc250e723 100644
>> --- a/Documentation/topics/usdt-probes.rst
>> +++ b/Documentation/topics/usdt-probes.rst
>> @@ -214,6 +214,10 @@ Available probes in ``ovs_vswitchd``:
>>  - dpif_recv:recv_upcall
>>  - main:poll_block
>>  - main:run_start
>> +- revalidate_ukey\_\_:entry
>> +- revalidate_ukey\_\_:exit
>> +- udpif_revalidator:start_dump
>> +- udpif_revalidator:sweep_done
>>  dpif_netlink_operate\_\_:op_flow_del
>> @@ -327,6 +331,7 @@ probe main:run_start
>>  ~~~~~~~~~~~~~~~~~~~~
>>  **Description**:
>> +
>>  The ovs-vswitchd's main process contains a loop that runs every time some work
>>  needs to be done. This probe gets triggered every time the loop starts from the
>>  beginning. See also the ``main:poll_block`` probe below.
>> @@ -344,6 +349,7 @@ probe main:poll_block
>>  ~~~~~~~~~~~~~~~~~~~~~
>>  **Description**:
>> +
>>  The ovs-vswitchd's main process contains a loop that runs every time some work
>>  needs to be done. This probe gets triggered every time the loop is done, and
>>  it's about to wait for being re-started by a poll_block() call returning.
>> @@ -358,6 +364,84 @@ See also the ``main:run_start`` probe above.
>>  - ``utilities/usdt-scripts/bridge_loop.bt``
>> +
>> +revalidate_ukey\_\_:entry
>> +~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +**Description**:
>> +
>> +This probe gets triggered on entry of the revalidate_ukey__() function.
>> +

> Why not trace the beginning and end of revalidate_ukey() so we can also
> detect when a flow is evicted because should_revalidate() returns false?

Because this trace event is meant to capture the actual revalidation of a ukey, not a potential one. If we need that, we can add a more specific probe, but I think Kevin's probe will cover this.
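For what it's worth, the exit probe added in this patch already exposes the (enum reval_result) result as arg2, so tallying outcomes per result only takes a handful of lines of BCC. An untested sketch (it assumes an ovs-vswitchd built with --enable-usdt-probes, takes its PID as the first command-line argument, and the handler name is made up):

    from bcc import BPF, USDT
    import sys
    import time

    # Tally revalidate_ukey__() outcomes, keyed on the exit probe's
    # (enum reval_result) result, i.e. arg2 (bpf_usdt_readarg() is
    # 1-based, hence index 3).
    text = """
    BPF_HISTOGRAM(results, u64);

    int do_reval_exit(struct pt_regs *ctx) {
        u64 result = 0;

        bpf_usdt_readarg(3, ctx, &result);
        results.increment(result);
        return 0;
    }
    """

    u = USDT(pid=int(sys.argv[1]))
    u.enable_probe(probe="revalidate_ukey__:exit", fn_name="do_reval_exit")
    b = BPF(text=text, usdt_contexts=[u])

    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        pass

    for key, value in sorted(b["results"].items(), key=lambda kv: kv[0].value):
        print("reval_result {}: {}".format(key.value, value.value))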
> Also, what are your plans wrt Kevin's patch [1]? Is this one supposed to
> supersede it? It introduces an enum to indicate the reason why a flow gets
> evicted, which I think is a good idea. Should we modify the USDT probe
> accordingly when the reason is added?

I have not heard from Kevin since my comments on v6; I hope he will send a v7 soon, based on the feedback and on the pahole solution now being available. Kevin's patch is about individual ukey revalidation, while this patch is more about the overall revalidator characteristics, so I see no need to merge them into one. However, it would be nice for Kevin to send a new revision.

> [1] https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/

>> +**Arguments**:
>> +
>> +- *arg0*: ``(struct udpif *) udpif``
>> +- *arg1*: ``(struct udpif_key *) ukey``
>> +- *arg2*: ``(uint16_t) tcp_flags``
>> +- *arg3*: ``(struct ofpbuf *) odp_actions``
>> +- *arg4*: ``(struct recirc_refs *) recircs``
>> +- *arg5*: ``(struct xlate_cache *) xcache``
>> +
>> +**Script references**:
>> +
>> +- ``utilities/usdt-scripts/reval_monitor.py``
>> +
>> +
>> +revalidate_ukey\_\_:exit
>> +~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +**Description**:
>> +
>> +This probe gets triggered right before the revalidate_ukey__() function exits.
>> +
>> +**Arguments**:
>> +
>> +- *arg0*: ``(struct udpif *) udpif``
>> +- *arg1*: ``(struct udpif_key *) ukey``
>> +- *arg2*: ``(enum reval_result) result``
>> +
>> +**Script references**:
>> +
>> +*None*
>> +
>> +
>> +udpif_revalidator:start_dump
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +**Description**:
>> +
>> +The ovs-vswitchd's revalidator process contains a loop that runs every time
>> +revalidation work is needed. This probe gets triggered every time the
>> +dump phase has started.
>> +
>> +**Arguments**:
>> +
>> +- *arg0*: ``(struct udpif *) udpif``
>> +- *arg1*: ``(size_t) n_flows``
>> +
>> +**Script references**:
>> +
>> +- ``utilities/usdt-scripts/reval_monitor.py``
>> +
>> +
>> +udpif_revalidator:sweep_done
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +**Description**:
>> +
>> +The ovs-vswitchd's revalidator process contains a loop that runs every time
>> +revalidation work is needed. This probe gets triggered every time the
>> +sweep phase has completed.
>> +
>> +**Arguments**:
>> +
>> +- *arg0*: ``(struct udpif *) udpif``
>> +- *arg1*: ``(size_t) n_flows``
>> +- *arg2*: ``(unsigned) MIN(ofproto_max_idle, ofproto_max_revalidator)``
>> +
>> +**Script references**:
>> +
>> +- ``utilities/usdt-scripts/reval_monitor.py``
>> +
>> +
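A side note for anyone who wants to play with these two probes without the full script below: start_dump and sweep_done pair up naturally, so a few lines of BCC are enough to watch the revalidator cadence. Another untested sketch (same assumptions as above: a USDT-enabled ovs-vswitchd, its PID as the first argument, handler names made up):

    from bcc import BPF, USDT
    import sys

    # Print the time from start_dump to sweep_done for every revalidator
    # round, plus the n_flows value passed to sweep_done (arg1 -> index 2).
    text = """
    BPF_ARRAY(start_ts, u64, 1);

    int on_start_dump(struct pt_regs *ctx) {
        int idx = 0;
        u64 ts = bpf_ktime_get_ns();

        start_ts.update(&idx, &ts);
        return 0;
    }

    int on_sweep_done(struct pt_regs *ctx) {
        int idx = 0;
        u64 n_flows = 0;
        u64 *ts = start_ts.lookup(&idx);

        bpf_usdt_readarg(2, ctx, &n_flows);
        if (ts && *ts)
            bpf_trace_printk("n_flows=%llu round=%llu us\\n", n_flows,
                             (bpf_ktime_get_ns() - *ts) / 1000);
        return 0;
    }
    """

    u = USDT(pid=int(sys.argv[1]))
    u.enable_probe(probe="start_dump", fn_name="on_start_dump")
    u.enable_probe(probe="sweep_done", fn_name="on_sweep_done")

    BPF(text=text, usdt_contexts=[u]).trace_print()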
>>  Adding your own probes
>>  ----------------------
>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>> index ad9635496..d078ccc87 100644
>> --- a/ofproto/ofproto-dpif-upcall.c
>> +++ b/ofproto/ofproto-dpif-upcall.c
>> @@ -42,6 +42,7 @@
>>  #include "seq.h"
>>  #include "tunnel.h"
>>  #include "unixctl.h"
>> +#include "openvswitch/usdt-probes.h"
>>  #include "openvswitch/vlog.h"
>>  #include "lib/netdev-provider.h"
>> @@ -978,6 +979,7 @@ udpif_revalidator(void *arg)
>>                  terse_dump = udpif_use_ufid(udpif);
>>                  udpif->dump = dpif_flow_dump_create(udpif->dpif, terse_dump,
>>                                                      NULL);
>> +                OVS_USDT_PROBE(udpif_revalidator, start_dump, udpif,
>> +                               n_flows);
>>              }
>>          }
>> @@ -1029,6 +1031,9 @@ udpif_revalidator(void *arg)
>>                            duration);
>>              }
>> +            OVS_USDT_PROBE(udpif_revalidator, sweep_done, udpif, n_flows,
>> +                           MIN(ofproto_max_idle, ofproto_max_revalidator));
>> +
>>              poll_timer_wait_until(start_time + MIN(ofproto_max_idle,
>>                                                     ofproto_max_revalidator));
>>              seq_wait(udpif->reval_seq, last_reval_seq);
>> @@ -2235,6 +2240,9 @@ revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
>>          .wc = &wc,
>>      };
>> +    OVS_USDT_PROBE(revalidate_ukey__, entry, udpif, ukey, tcp_flags,
>> +                   odp_actions, recircs, xcache);
>> +
>>      result = UKEY_DELETE;
>>      xoutp = NULL;
>>      netflow = NULL;
>> @@ -2298,6 +2306,9 @@ exit:
>>          netflow_flow_clear(netflow, &ctx.flow);
>>      }
>>      xlate_out_uninit(xoutp);
>> +
>> +    OVS_USDT_PROBE(revalidate_ukey__, exit, udpif, ukey, result);
>> +
>>      return result;
>>  }
>> diff --git a/utilities/automake.mk b/utilities/automake.mk
>> index 132a16942..9c22ee94d 100644
>> --- a/utilities/automake.mk
>> +++ b/utilities/automake.mk
>> @@ -23,6 +23,7 @@ scripts_DATA += utilities/ovs-lib
>>  usdt_SCRIPTS += \
>>          utilities/usdt-scripts/bridge_loop.bt \
>>          utilities/usdt-scripts/upcall_cost.py \
>> +        utilities/usdt-scripts/reval_monitor.py \
>>          utilities/usdt-scripts/upcall_monitor.py
>>  completion_SCRIPTS += \
>> @@ -68,7 +69,8 @@ EXTRA_DIST += \
>>          utilities/docker/debian/build-kernel-modules.sh \
>>          utilities/usdt-scripts/bridge_loop.bt \
>>          utilities/usdt-scripts/upcall_cost.py \
>> -        utilities/usdt-scripts/upcall_monitor.py
>> +        utilities/usdt-scripts/upcall_monitor.py \
>> +        utilities/usdt-scripts/reval_monitor.py
>>  MAN_ROOTS += \
>>          utilities/ovs-testcontroller.8.in \
>>          utilities/ovs-dpctl.8.in \
>> diff --git a/utilities/usdt-scripts/reval_monitor.py b/utilities/usdt-scripts/reval_monitor.py
>> new file mode 100755
>> index 000000000..148f87320
>> --- /dev/null
>> +++ b/utilities/usdt-scripts/reval_monitor.py
>> @@ -0,0 +1,858 @@
>> +#!/usr/bin/env python3
>> +#
>> +# Copyright (c) 2021 Red Hat, Inc.
>> +#
>> +# Licensed under the Apache License, Version 2.0 (the "License");
>> +# you may not use this file except in compliance with the License.
>> +# You may obtain a copy of the License at:
>> +#
>> +#     http://www.apache.org/licenses/LICENSE-2.0
>> +#
>> +# Unless required by applicable law or agreed to in writing, software
>> +# distributed under the License is distributed on an "AS IS" BASIS,
>> +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> +# See the License for the specific language governing permissions and
>> +# limitations under the License.
>> +#
>> +# Script information:
>> +# -------------------
>> +# reval_monitor.py uses various user-defined tracepoints to get all the
>> +# revalidator-process related variables and will display them in a (dynamic)
>> +# graph. In addition, it will also dump the data to the console
>> +# in a CSV format. Note that all the graphical output can be disabled.
>> +#
>> +# All the USDT events can be saved to a file and can then be used to
>> +# replay the trace offline and still get the plots.
>> +#
>> +# The script can simply be invoked without any options, and it will try
>> +# to find the running ovs-vswitchd instance:
>> +#
>> +#   # ./reval_monitor.py
>> +#   # Starting trace @2022-09-20T04:07:43.588749 (08:07:43 UTC)
>> +#   ts_start, ts_complete, n_flows, n_reval_flows, avg_n_flows, max_n_flows,
>> +#   flow_limit, dump_duration, poll_wait, actual_wait
>> +#   1741367714251645, 1741367714532545, 0, 0, 0, 10000, 69000, 1, 500, 500.52
>> +#   1741368215056961, 1741368215318223, 0, 0, 0, 10000, 69000, 1, 500, 500.55
>> +#   1741368715865871, 1741368716107089, 0, 0, 0, 10000, 69000, 1, 500, 499.48
>> +#   ^C# Stopping trace @2022-09-20T04:07:49.893827 (08:07:49 UTC)
>> +#
>> +# IMPORTANT NOTE: This script only works when a single datapath is configured!
>> +#
>> +# The following are the available options:
>> +#
>> +# usage: reval_monitor.py [-h] [-c] [--buffer-page-count NUMBER]
>> +#                         [-D [DEBUG]] [-g] [--no-ukey-count]
>> +#                         [-p VSWITCHD_PID] [-P PAHOLE] [-r FILE] [-R]
>> +#                         [-u SECONDS] [-w FILE] [-W FILE]
>> +#
>> +# options:
>> +#   -h, --help            show this help message and exit
>> +#   -c, --compress-output
>> +#                         Compress output, i.e. only dump changes in
>> +#                         the dataset
>> +#   --buffer-page-count NUMBER
>> +#                         Number of BPF ring buffer pages, default 1024
>> +#   -D [DEBUG], --debug [DEBUG]
>> +#                         Enable eBPF debugging
>> +#   -g, --no-gui          Do use the gui to display plots
>> +#   --no-ukey-count       No revalidate_ukey() counting
>> +#   -p VSWITCHD_PID, --pid VSWITCHD_PID
>> +#                         ovs-vswitch's PID
>> +#   -P PAHOLE, --pahole PAHOLE
>> +#                         Pahole executable to use, default pahole
>> +#   -r FILE, --read-events FILE
>> +#                         Read events from <FILE> instead of installing
>> +#                         tracepoints
>> +#   -R, --no-realtime-plots
>> +#                         Do not show realtime plot while tracing
>> +#   -u SECONDS, --update-interval SECONDS
>> +#                         Seconds to wait between real time update,
>> +#                         default 1
>> +#   -w FILE, --write-events FILE
>> +#                         Write events to <FILE>
>> +#   -W FILE, --write-charts FILE
>> +#                         Write overall charts to <FILE>.png
>> +#
>> +# The -g option disables all GUI output of matplotlib; -R only disables the
>> +# real-time plots. As real-time plots are rather slow, the -u option can be
>> +# used to only update the graph every x seconds, which might speed up the
>> +# processing.
>> +#
>> +# The --no-ukey-count option disables counting of the number of flows actually
>> +# being revalidated against the current OpenFlow ruleset. This will not install
>> +# the specific tracepoint which would be called for each flow being
>> +# revalidated.
>> +#
>> +# What is plotted in the graphs (and dumped in the CSV output)?
>> +# - n_flows:       Number of flows active in the system.
>> +# - n_reval_flows: Number of flows that were revalidated against the OpenFlow
>> +#                  ruleset.
>> +# - dump_duration: Time it took to dump and process all flows.
>> +# - avg_n_flows:   Average number of flows in the system.
>> +# - max_n_flows:   Maximum number of flows in the system.
>> +# - flow_limit:    Dynamic flow limit.
>> +# - poll_wait:     Time requested for the poll wait.
>> +# - actual_wait:   Time it took to be woken up.
>> +#
>> +# Dependencies:
>> +#   This script needs the 'readelf' binary to be available. In addition, it
>> +#   also needs pahole to be installed, and it needs a version equal to or
>> +#   newer than the following commit on the next branch:
>> +#
>> +#   https://git.kernel.org/pub/scm/devel/pahole/pahole.git/?h=next
>> +#     c55b13b9d785 ("WIP: Remove DW_TAG_atomic_type when encoding BTF")
>> +#

> Should we give the users a default set of headers they can use if they
> cannot install such a version of pahole?

I guess this is what we wanted to avoid, i.e. having static headers defined anywhere. In addition, I also do not think we should add this at compile time, as it would add pahole as a build dependency. I would assume that people who need to use this script are technical enough to figure out how to get this from pahole on a development setup.
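For what it's worth, checking whether a given pahole build is usable boils down to the exact command the script runs internally. Assuming $PID holds ovs-vswitchd's PID (output trimmed here):

    # pahole -C udpif --compile /proc/$PID/exe
    struct udpif {
            ...
    };

If this prints something like "pahole: type 'udpif' not found", or errors out, the binary has no usable debug info, or the pahole version is too old.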
>> +# To use a locally compiled pahole the --pahole option can be used.
>> +# For example:
>> +#   # ./reval_monitor.py --pahole ~/pahole/build/pahole -g
>> +#   # Starting trace @2022-12-20T14:57:26.077815 (13:57:26 UTC)
>> +#   ts_start, ts_complete, n_flows, n_reval_flows, avg_n_flows, max_n_flows, \
>> +#     flow_limit, dump_duration, poll_wait, actual_wait
>> +#   4202771850983494, 4202771851472838, 0, 0, 0, 0, 10000, 1, 500, 15.06
>> +#   4202771866531996, 4202771867713366, 0, 0, 0, 0, 10000, 1, 500, 4.23
>> +#   4202771871941979, 4202771872749915, 0, 0, 0, 0, 10000, 1, 500, 500.02
>> +#   4202772372770361, 4202772373531820, 0, 0, 0, 0, 10000, 1, 500, 499.96
>> +#   4202772873487942, 4202772874514753, 0, 0, 0, 0, 10000, 1, 500, 500.01
>> +#   4202773374528435, 4202773375695054, 0, 0, 0, 0, 10000, 1, 500, 500.01
>> +#   4202773875701559, 4202773876880763, 0, 0, 0, 0, 10000, 1, 500, 500.04
>> +#   4202774376925058, 4202774377905799, 0, 0, 0, 0, 10000, 1, 500, 500.03
>> +#   ^C# Stopping trace @2022-12-20T14:57:40.391730 (13:57:40 UTC)
>> +#
>> +
>> +try:
>> +    from bcc import BPF, USDT, USDTException
>> +except ModuleNotFoundError:
>> +    print("WARNING: Can't find the BPF Compiler Collection (BCC) tools!")
>> +    print("         This is NOT a problem if you are analyzing previously"
>> +          " collected data.\n")
>> +
>> +from collections import namedtuple
>> +
>> +import argparse
>> +import ast
>> +import datetime
>> +import re
>> +import subprocess
>> +import sys
>> +import time
>> +
>> +import pytz
>> +import psutil
>> +import matplotlib.pyplot as plt
>> +
>> +#
>> +# Actual eBPF source code
>> +#
>> +EBPF_SOURCE = """
>> +#include <linux/sched.h>
>> +
>> +<OVS_INCLUDE_DEFINITIONS>
>> +
>> +struct event_t {
>> +    u64 ts;
>> +    u32 pid;
>> +    u32 id;
>> +    u64 n_flows;
>> +    u32 avg_n_flows;
>> +    u32 max_n_flows;
>> +    u32 flow_limit;
>> +    u32 dump_duration;
>> +    u32 poll_wait;
>> +};
>> +
>> +
>> +BPF_RINGBUF_OUTPUT(events, <BUFFER_PAGE_CNT>);
>> +BPF_TABLE("percpu_array", uint32_t, uint64_t, dropcnt, 1);
>> +
>> +static struct event_t *get_event(uint32_t id) {
>> +    struct event_t *event = events.ringbuf_reserve(sizeof(struct event_t));
>> +
>> +    if (!event) {
>> +        uint32_t type = 0;
>> +        uint64_t *value = dropcnt.lookup(&type);
>> +        if (value)
>> +            __sync_fetch_and_add(value, 1);
>> +
>> +        return NULL;
>> +    }
>> +
>> +    event->id = id;
>> +    event->ts = bpf_ktime_get_ns();
>> +    event->pid = bpf_get_current_pid_tgid();
>> +
>> +    return event;
>> +}
>> +
>> +int probe__start_dump(struct pt_regs *ctx) {
>> +    struct event_t *event = get_event(0);
>> +    if (!event)
>> +        return 1;
>> +
>> +    events.ringbuf_submit(event, 0);
>> +    return 0;
>> +};
>> +
>> +int probe__sweep_done(struct pt_regs *ctx) {
>> +    struct udpif udpif;
>> +
>> +    bpf_usdt_readarg_p(1, ctx, &udpif, sizeof(udpif));
>> +
>> +    struct event_t *event = get_event(1);
>> +    if (!event)
>> +        return 1;
>> +
>> +    event->avg_n_flows = udpif.avg_n_flows;
>> +    event->max_n_flows = udpif.max_n_flows;
>> +    event->flow_limit = udpif.flow_limit;
>> +    event->dump_duration = udpif.dump_duration;
>> +
>> +    bpf_usdt_readarg(2, ctx, &event->n_flows);
>> +    bpf_usdt_readarg(3, ctx, &event->poll_wait);
>> +
>> +    events.ringbuf_submit(event, 0);
>> +    return 0;
>> +};
>> +
>> +int probe__reval_entry(struct pt_regs *ctx) {
>> +    struct event_t *event = get_event(2);
>> +    if (!event)
>> +        return 1;
>> +
>> +    events.ringbuf_submit(event, 0);
>> +    return 0;
>> +};
>> +"""
>> +
>> +
>> +#
>> +# event_to_dict()
>> +#
>> +def event_to_dict(event):
>> +    event_dict = {}
>> +
>> +    for field, _ in event._fields_:
>> +        if isinstance(getattr(event, field), (int, bytes)):
>> +            event_dict[field] = getattr(event, field)
>> +
>> +    return event_dict
>> +
>> +
>> +#
>> +# print_csv_header()
>> +#
>> +def print_csv_header():
>> +    print("ts_start, ts_complete, n_flows, n_reval_flows, avg_n_flows, "
>> +          "max_n_flows, flow_limit, dump_duration, poll_wait, actual_wait")
>> +
>> +
>> +#
>> +# process_event()
>> +#
>> +def process_event(ctx, data, size):
>> +    event = b['events'].event(data)
>> +    _process_event(event)
>> +
>> +
>> +def _process_event(event):
>> +    global graph
>> +
>> +    if export_file is not None:
>> +        export_file.write("event = {}\n".format(event_to_dict(event)))
>> +
>> +    if event.id == 0 and not state['running']:
>> +        start = state["last_start"]
>> +        done = state["last_done"]
>> +        if done is not None and start is not None:
>> +            actual_wait = (event.ts - done.ts) / 1000000
>> +            csv = "{}, {}, {}, {}, {}, {}, {}, {}, {}, {:.2f}".format(
>> +                start.ts, done.ts, done.n_flows, graph.ukey_count,
>> +                done.avg_n_flows, done.max_n_flows, done.flow_limit,
>> +                done.dump_duration, done.poll_wait, actual_wait)
>> +
>> +            if graph.base_time == 0:
>> +                graph = graph._replace(base_time=done.ts)
>> +
>> +            graph.time.append((done.ts - graph.base_time) / 1000000000)
>> +            graph.n_flows.append(done.n_flows)
>> +            graph.n_reval_flows.append(graph.ukey_count)
>> +            graph.avg_n_flows.append(done.avg_n_flows)
>> +            graph.max_n_flows.append(done.max_n_flows)
>> +            graph.flow_limit.append(done.flow_limit)
>> +            graph.dump_duration.append(done.dump_duration)
>> +            graph.poll_wait.append(done.poll_wait)
>> +            graph.actual_wait.append(actual_wait)
>> +
>> +            if not options.no_gui and not options.no_realtime_plots:
>> +                updated_graph = dynamic_plot_update(
>> +                    graph, refresh=options.update_interval)
>> +                if updated_graph is None:
>> +                    raise KeyboardInterrupt
>> +                graph = updated_graph
>> +
>> +            if options.compress_output:
>> +                last_csv = state["last_csv"]
>> +                if last_csv is None or \
>> +                   csv.split(",")[2:-1] != last_csv.split(",")[2:-1] or \
>> +                   abs((event.ts - done.ts) / 1000000 - done.poll_wait) > 100:
>> +                    print(csv)
>> +                else:
>> +                    state["last_not_printed_csv"] = csv
>> +
>> +                state["last_csv"] = csv
>> +            else:
>> +                print(csv)
>> +
>> +        state["last_start"] = event
>> +        state['running'] = True
>> +        graph = graph._replace(ukey_count=0)
>> +    elif event.id == 1 and state['running']:
>> +        state["last_done"] = event
>> +        state['running'] = False
>> +    elif event.id == 2 and state['running']:
>> +        graph = graph._replace(ukey_count=graph.ukey_count + 1)
>> +
>> +
>> +#
>> +# run_program()
>> +#
>> +def run_program(command, need_result=True):
>> +    try:
>> +        process = subprocess.run(command,
>> +                                 stdout=subprocess.PIPE,
>> +                                 stderr=subprocess.STDOUT,
>> +                                 encoding='utf8',
>> +                                 check=True)
>> +
>> +    except subprocess.CalledProcessError as perror:
>> +        if need_result:
>> +            return perror.returncode, perror.stdout
>> +
>> +        return perror.returncode
>> +
>> +    if need_result:
>> +        return 0, process.stdout
>> +
>> +    return 0
>> +
>> +
>> +#
>> +# get_ovs_definitions()
>> +#
>> +def get_ovs_definitions(objects, pahole="pahole", pid="self"):
>> +    if not isinstance(objects, list):
>> +        objects = [objects]
>> +
>> +    if len(objects) == 0:
>> +        raise ValueError("Must supply at least one object!")
>> +
>> +    vswitchd = "/proc/{}/exe".format(str(pid))
>> +
>> +    object_str = ""
>> +    for obj in objects:
>> +        object_str += obj + ','
>> +
>> +    object_str = object_str.rstrip(',')
>> +
>> +    def run_pahole(debug_file):
>> +        error, result = run_program([pahole, "-C", object_str, "--compile",
>> +                                     debug_file])
>> +
>> +        if error:
>> +            if "pahole: " + debug_file + ": Invalid argument" not in result:
>> +                print("ERROR: Pahole failed to get ovs-vswitchd data "
>> +                      "structures!\n{}".format(re.sub('^', ' ' * 7,
>> +                                                      result.rstrip(),
>> +                                                      flags=re.MULTILINE)))
>> +                sys.exit(-1)
>> +
>> +            return None
>> +
>> +        if bool(re.search("pahole: type .* not found", result)):
>> +            return None
>> +
>> +        return result
>> +
>> +    def run_readelf(debug_file):
>> +        error, result = run_program(['readelf', "-n",
>> +                                     "--debug-dump=links", debug_file])
>> +
>> +        if error:
>> +            print("ERROR: Failed 'readelf' on \"{}\"!\n{}".
>> +                  format(debug_file, re.sub('^', ' ' * 7, result,
>> +                                            flags=re.MULTILINE)))
>> +            sys.exit(-1)
>> +
>> +        return result
>> +
>> +    #
>> +    # First try to find the debug data as part of the executable
>> +    #
>> +    result = run_pahole(vswitchd)
>> +
>> +    if result is None:
>> +        print("INFO: Failed to find debug info in \"{}\"!".format(vswitchd))
>> +
>> +        #
>> +        # Get additional debug and dwz information if available
>> +        #
>> +        elf_result = run_readelf(vswitchd)
>> +        match = re.search("Build ID: ([0-9a-fA-F]+)", elf_result)
>> +        if match is None:
>> +            print("ERROR: Can't find build ID to read debug symbols!")
>> +            sys.exit(-1)
>> +
>> +        dbg_file = "/usr/lib/debug/.build-id/{}/{}.debug".format(
>> +            match.group(1)[:2], match.group(1)[2:])
>> +
>> +        #
>> +        # First try to find data in default debug image
>> +        #
>> +        result = run_pahole(dbg_file)
>> +
>> +    if result is None:
>> +        print("ERROR: Failed to find needed data structures through pahole!")
>> +        sys.exit(-1)
>> +
>> +    #
>> +    # We need an empty _Atomic definition to avoid compiler complaints.
>> +    #
>> +    result = "#define _Atomic\n" + result
>> +
>> +    #
>> +    # Remove the uint64_t definition as it conflicts with the kernel one.
>> +    #
>> +    result = re.sub("^typedef.*uint64_t;$", "", result, flags=re.MULTILINE)
>> +
>> +    return result
>> +
>> +
>> +#
>> +# next_power_of_two()
>> +#
>> +def next_power_of_two(val):
>> +    np = 1
>> +    while np < val:
>> +        np *= 2
>> +    return np
>> +
>> +
>> +#
>> +# dynamic_plot_init()
>> +#
>> +def dynamic_plot_init(real_time=True):
>> +
>> +    if real_time:
>> +        lines = []
>> +        fig, axs = plt.subplots(4, figsize=(19, 10))
>> +        fig.suptitle('Revalidator Handling')
>> +        for ax in axs:
>> +            ax.grid()
>> +
>> +        axs[0].set_ylabel("Number of flows", weight='bold')
>> +        axs[1].set_ylabel("Time spent (ms)", weight='bold')
>> +        axs[2].set_ylabel("Number of flows", weight='bold')
>> +        axs[3].set_ylabel("Time spent (ms)", weight='bold')
>> +        axs[3].set_xlabel("Time (seconds since start)", weight='bold')
>> +
>> +        lines.append(axs[0].plot([], [], label="n_flows", marker='o')[0])
>> +        lines.append(axs[0].plot([], [], label="n_reval_flows")[0])
>> +        axs[0].legend(bbox_to_anchor=(1, 1), loc='upper left',
>> +                      borderaxespad=0.5)
>> +        axs[0].set_xlim(0, 30)
>> +        axs[0].set_ylim(-4, 104)
>> +
>> +        lines.append(axs[1].plot([], [], color="orange",
>> +                                 label="dump_duration")[0])
>> +        axs[1].legend(bbox_to_anchor=(1, 1),
>> +                      loc='upper left', borderaxespad=0.5)
>> +        axs[1].set_xlim(0, 30)
>> +        axs[1].set_ylim(-0.4, 10.4)
>> +
>> +        lines.append(axs[2].plot([], [], label="avg_n_flows")[0])
>> +        lines.append(axs[2].plot([], [], label="max_n_flows")[0])
>> +        lines.append(axs[2].plot([], [], label="flow_limit")[0])
>> +        axs[2].legend(bbox_to_anchor=(1, 1), loc='upper left',
>> +                      borderaxespad=0.5)
>> +        axs[2].set_xlim(0, 30)
>> +        axs[2].set_ylim(-600, 15600)
>> +
>> +        lines.append(axs[3].plot([], [], label="poll_wait")[0])
>> +        lines.append(axs[3].plot([], [], label="actual_wait")[0])
>> +        axs[3].legend(bbox_to_anchor=(1, 1), loc='upper left',
>> +                      borderaxespad=0.5)
>> +        axs[3].set_xlim(0, 30)
>> +        axs[3].set_ylim(-20, 520)
>> +
>> +        fig.tight_layout()
>> +
>> +        plt.ion()
>> +        plt.show()
>> +    else:
>> +        fig = None
>> +        axs = None
>> +        lines = None
>> +
>> +    graph_data = {"base_time": 0,
>> +                  "l_index": 0,
>> +                  "fig": fig,
>> +                  "axs": axs,
>> +                  "lines": lines,
>> +                  "last_update": 0,
>> +                  "ukey_count": 0,
>> +                  "time": [],
>> +                  "n_flows": [],
>> +                  "n_reval_flows": [],
>> +                  "avg_n_flows": [],
>> +                  "max_n_flows": [],
>> +                  "flow_limit": [],
>> +                  "dump_duration": [],
>> +                  "poll_wait": [],
>> +                  "actual_wait": []}
>> +
>> +    return namedtuple("GraphData", graph_data.keys())(*graph_data.values())
>> +
>> +
>> +#
>> +# dynamic_plot_update()
>> +#
>> +def dynamic_plot_update(graph_data, refresh=1):
>> +
>> +    if graph_data.last_update != 0 and \
>> +       (graph_data.time[-1] - graph_data.last_update) < refresh:
>> +        return graph_data
>> +
>> +    graph_data = graph_data._replace(last_update=graph_data.time[-1])
>> +
>> +    if (graph_data.time[-1] - graph_data.time[graph_data.l_index]) > 30:
>> +        for i in range(graph_data.l_index + 1, len(graph_data.time)):
>> +            if (graph_data.time[-1] - graph_data.time[i]) <= 30:
>> +                graph_data = graph_data._replace(l_index=i)
>> +                break
>> +
>> +    for line in graph_data.lines:
>> +        line.set_xdata(graph_data.time[graph_data.l_index:])
>> +
>> +    graph_data.lines[0].set_ydata(graph_data.n_flows[graph_data.l_index:])
>> +    graph_data.lines[1].set_ydata(
>> +        graph_data.n_reval_flows[graph_data.l_index:])
>> +    graph_data.lines[2].set_ydata(
>> +        graph_data.dump_duration[graph_data.l_index:])
>> +    graph_data.lines[3].set_ydata(graph_data.avg_n_flows[graph_data.l_index:])
>> +    graph_data.lines[4].set_ydata(graph_data.max_n_flows[graph_data.l_index:])
>> +    graph_data.lines[5].set_ydata(graph_data.flow_limit[graph_data.l_index:])
>> +    graph_data.lines[6].set_ydata(graph_data.poll_wait[graph_data.l_index:])
>> +    graph_data.lines[7].set_ydata(graph_data.actual_wait[graph_data.l_index:])
>> +
>> +    for ax in graph_data.axs:
>> +        if graph_data.l_index == 0:
>> +            ax.autoscale(enable=True, axis='y')
>> +        else:
>> +            ax.autoscale(enable=True)
>> +
>> +        ax.relim(visible_only=True)
>> +        ax.autoscale_view(tight=True, scalex=True, scaley=True)
>> +
>> +    try:
>> +        graph_data.fig.canvas.draw()
>> +        graph_data.fig.canvas.flush_events()
>> +    except KeyboardInterrupt:
>> +        return None
>> +
>> +    return graph_data
>> +
>> +
>> +#
>> +# show_graph()
>> +#
>> +def show_graph(graph_data, gui=False, file_name=None):
>> +
>> +    if len(graph_data.time) == 0 or (not gui and file_name is None):
>> +        return
>> +
>> +    plt.ioff()
>> +
>> +    fig, (nf_ax, dd_ax, f_ax, t_ax) = plt.subplots(4, figsize=(19, 10))
>> +    fig.suptitle('Revalidator Handling')
>> +    nf_ax.grid()
>> +    f_ax.grid()
>> +    dd_ax.grid()
>> +    t_ax.grid()
>> +
>> +    nf_ax.set_ylabel("Number of flows", weight='bold')
>> +    f_ax.set_ylabel("Number of flows", weight='bold')
>> +    dd_ax.set_ylabel("Time spent (ms)", weight='bold')
>> +    t_ax.set_ylabel("Time spent (ms)", weight='bold')
>> +    t_ax.set_xlabel("Time (seconds since start)", weight='bold')
>> +
>> +    nf_ax.plot(graph_data.time, graph_data.n_flows, label="n_flows")
>> +    nf_ax.plot(graph_data.time, graph_data.n_reval_flows,
>> +               label="n_reval_flows")
>> +    nf_ax.legend(bbox_to_anchor=(1, 1), loc='upper left', borderaxespad=0.5)
>> +
>> +    dd_ax.plot(graph_data.time, graph_data.dump_duration, color="orange",
>> +               label="dump_duration")
>> +    dd_ax.legend(bbox_to_anchor=(1, 1), loc='upper left', borderaxespad=0.5)
>> +
>> +    f_ax.plot(graph_data.time, graph_data.avg_n_flows, label="avg_n_flows")
>> +    f_ax.plot(graph_data.time, graph_data.max_n_flows, label="max_n_flows")
>> +    f_ax.plot(graph_data.time, graph_data.flow_limit, label="flow_limit")
>> +    f_ax.legend(bbox_to_anchor=(1, 1), loc='upper left', borderaxespad=0.5)
>> +
>> +    t_ax.plot(graph_data.time, graph_data.poll_wait, label="poll_wait")
>> +    t_ax.plot(graph_data.time, graph_data.actual_wait, label="actual_wait")
>> +    t_ax.legend(bbox_to_anchor=(1, 1), loc='upper left', borderaxespad=0.5)
>> +
>> +    fig.tight_layout()
>> +
>> +    if file_name is not None and file_name != "":
>> +        fig.savefig(file_name + '.png')
>> +
>> +    if gui:
>> +        try:
>> +            plt.show()
>> +        except KeyboardInterrupt:
>> +            pass
>> +
>> +    plt.close(fig)
>> +
>> +
>> +#
>> +# process_events_from_file()
>> +#
>> +def process_events_from_file(file_name):
>> +    try:
>> +        with open(file_name, 'r') as fd:
>> +            print("- Reading events from \"{}\"...".format(file_name))
>> +
>> +            print_csv_header()
>> +            for entry in fd:
>> +                entry.rstrip()
>> +                if entry.startswith('event = {'):
>> +                    event = ast.literal_eval(entry[8:])
>> +                    event = namedtuple("EventObject",
>> +                                       event.keys())(*event.values())
>> +                    try:
>> +                        _process_event(event)
>> +                    except KeyboardInterrupt:
>> +                        break
>> +
>> +    except (FileNotFoundError, PermissionError):
>> +        print("ERROR: Can't open file \"{}\" for reading!".format(file_name))
>> +        sys.exit(-1)
>> +
>> +    show_graph(graph, gui=not options.no_gui, file_name=options.write_charts)
>> +
>> +
>> +#
>> +# main()
>> +#
>> +def main():
>> +    #
>> +    # Don't like these globals, but ctx passing does not seem to work with the
>> +    # existing open_ring_buffer() API :(
>> +    #
>> +    global b
>> +    global export_file
>> +    global options
>> +    global state
>> +    global graph
>> +
>> +    #
>> +    # Argument parsing
>> +    #
>> +    parser = argparse.ArgumentParser()
>> +
>> +    parser.add_argument("-c", "--compress-output", action="store_true",
>> +                        help="Compress output, i.e. only dump changes in "
>> +                             "the dataset")
>> +    parser.add_argument("--buffer-page-count",
>> +                        help="Number of BPF ring buffer pages, default 1024",
>> +                        type=int, default=1024, metavar="NUMBER")
>> +    parser.add_argument("-D", "--debug",
>> +                        help="Enable eBPF debugging",
>> +                        type=int, const=0x3f, default=0, nargs='?')
>> +    parser.add_argument("-g", "--no-gui", action="store_true",
>> +                        help="Do use the gui to display plots")

> s/Do use/Do not use/

Thanks, I'll send a quick v5 to speed things up.

>> +    parser.add_argument("--no-ukey-count", action="store_true",
>> +                        help="No revalidate_ukey() counting")
>> +    parser.add_argument("-p", "--pid", metavar="VSWITCHD_PID",
>> +                        help="ovs-vswitch's PID",
>> +                        type=int, default=None)
>> +    parser.add_argument("-P", "--pahole", metavar="PAHOLE",
>> +                        help="Pahole executable to use, default pahole",
>> +                        type=str, default="pahole")
>> +    parser.add_argument("-r", "--read-events",
>> +                        help="Read events from <FILE> instead of installing "
>> +                             "tracepoints", type=str, default=None,
>> +                        metavar="FILE")
>> +    parser.add_argument("-R", "--no-realtime-plots", action="store_true",
>> +                        help="Do not show realtime plot while tracing")
>> +    parser.add_argument("-u", "--update-interval",
>> +                        help="Seconds to wait between real time update, "
>> +                             "default 1", type=float, default=1,
>> +                        metavar="SECONDS")
>> +    parser.add_argument("-w", "--write-events",
>> +                        help="Write events to <FILE>",
>> +                        type=str, default=None, metavar="FILE")
>> +    parser.add_argument("-W", "--write-charts",
>> +                        help="Write overall charts to <FILE>.png",
>> +                        type=str, default=None, metavar="FILE")
>> +
>> +    options = parser.parse_args()
>> +
>> +    #
>> +    # Find the PID of the ovs-vswitchd daemon if not specified.
>> +    #
>> +    if options.pid is None and options.read_events is None:
>> +        for proc in psutil.process_iter():
>> +            if 'ovs-vswitchd' in proc.name():
>> +                if options.pid is not None:
>> +                    print("ERROR: Multiple ovs-vswitchd daemons running, "
>> +                          "use the -p option!")
>> +                    sys.exit(-1)
>> +
>> +                options.pid = proc.pid
>> +
>> +    #
>> +    # Error checking on input parameters
>> +    #
>> +    if options.pid is None and options.read_events is None:
>> +        print("ERROR: Failed to find ovs-vswitchd's PID!")
>> +        sys.exit(-1)
>> +
>> +    if options.read_events is not None and options.write_events is not None:
>> +        print("ERROR: Either supply the read or write events option, "
>> +              "not both!")
>> +        sys.exit(-1)
>> +
>> +    options.buffer_page_count = next_power_of_two(options.buffer_page_count)
>> +
>> +    #
>> +    # Define the state and graph
>> +    #
>> +    state = {"last_start": None,
>> +             "last_done": None,
>> +             "running": False,
>> +             "last_csv": None,
>> +             "last_not_printed_csv": None}
>> +
>> +    export_file = None
>> +
>> +    graph = dynamic_plot_init(real_time=(not options.no_gui
>> +                                         and not options.no_realtime_plots))
>> +
>> +    #
>> +    # Process events from file if required.
>> +    #
>> +    if options.read_events is not None:
>> +        process_events_from_file(options.read_events)
>> +        sys.exit(0)
>> +
>> +    #
>> +    # Open write handle if needed.
>> +    #
>> +    if options.write_events is not None:
>> +        try:
>> +            export_file = open(options.write_events, "w")
>> +        except (FileNotFoundError, IOError, PermissionError) as e:
>> +            print("ERROR: Can't create export file \"{}\": {}".format(
>> +                options.write_events, e.strerror))
>> +            sys.exit(-1)
>> +
>> +    #
>> +    # Attach the usdt probe
>> +    #
>> +    u = USDT(pid=int(options.pid))
>> +    try:
>> +        u.enable_probe(probe="start_dump", fn_name="probe__start_dump")
>> +        u.enable_probe(probe="sweep_done", fn_name="probe__sweep_done")
>> +        if not options.no_ukey_count:
>> +            u.enable_probe(probe="revalidate_ukey__:entry",
>> +                           fn_name="probe__reval_entry")
>> +    except USDTException as e:
>> +        print("ERROR: {}".format(
>> +            (re.sub('^', ' ' * 7, str(e), flags=re.MULTILINE)).strip().
>> +            replace("--with-dtrace or --enable-dtrace",
>> +                    "--enable-usdt-probes")))
>> +        sys.exit(-1)
>> +
>> +    #
>> +    # Attach probe to running process
>> +    #
>> +    source = EBPF_SOURCE.replace("<BUFFER_PAGE_CNT>",
>> +                                 str(options.buffer_page_count))
>> +    source = source.replace("<OVS_INCLUDE_DEFINITIONS>",
>> +                            get_ovs_definitions("udpif", pid=options.pid,
>> +                                                pahole=options.pahole))
>> +
>> +    b = BPF(text=source, usdt_contexts=[u], debug=options.debug)
>> +
>> +    #
>> +    # Print header
>> +    #
>> +    ltz = datetime.datetime.now()
>> +    utc = ltz.astimezone(pytz.utc)
>> +    time_string = "# Starting trace @{} ({} UTC)".format(
>> +        ltz.isoformat(), utc.strftime("%H:%M:%S"))
>> +
>> +    if export_file is not None:
>> +        export_file.write(time_string + "\n")
>> +
>> +    print(time_string)
>> +    print_csv_header()
>> +
>> +    #
>> +    # Process all events
>> +    #
>> +    b['events'].open_ring_buffer(process_event)
>> +    while 1:
>> +        try:
>> +            b.ring_buffer_poll()
>> +            time.sleep(0.5)
>> +        except KeyboardInterrupt:
>> +            break
>> +
>> +    dropcnt = b.get_table("dropcnt")
>> +    for k in dropcnt.keys():
>> +        count = dropcnt.sum(k).value
>> +        if k.value == 0 and count > 0:
>> +            print("\n# WARNING: Not all events were captured, {} were "
>> +                  "dropped!\n#          Increase the BPF ring buffer size "
>> +                  "with the --buffer-page-count option.".format(count))
>> +
>> +    #
>> +    # Display footer.
>> +    #
>> +    if state["last_not_printed_csv"] is not None:
>> +        print(state["last_not_printed_csv"])
>> +
>> +    ltz = datetime.datetime.now()
>> +    utc = ltz.astimezone(pytz.utc)
>> +    time_string = "# Stopping trace @{} ({} UTC)".format(
>> +        ltz.isoformat(), utc.strftime("%H:%M:%S"))
>> +
>> +    if export_file is not None:
>> +        export_file.write(time_string + "\n")
>> +
>> +    print(time_string)
>> +
>> +    #
>> +    # Close the event file if used.
>> +    #
>> +    if options.write_events is not None:
>> +        export_file.close()
>> +
>> +    #
>> +    # Do final graph if requested.
>> +    #
>> +    show_graph(graph, gui=not options.no_gui, file_name=options.write_charts)
>> +
>> +
>> +#
>> +# Start main() as the default entry point...
>> +#
>> +if __name__ == '__main__':
>> +    main()

> --
> Adrián Moreno

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
