Hi Jan,

I had problems applying the later patches in this series, so I'm reviewing just this one for now. I tried applying them against several revisions.
The second patch ([ovs-dev,v3,2/3] dpif-netdev: Detailed performance stats for PMDs) fails with:

    fatal: patch fragment without header at line 708: @@ -1073,6 +1155,12 @@ dpif_netdev_init(void)

Overall, not only is the user-visible output clearer, but the code is also more consistent and easier to understand.

I tested this patch by applying it to:

    3728b3b Ben Pfaff 2017-11-20 Merge branch 'dpdk_merge' of https://github.com/istokes/ovs into HEAD

These are the issues I did find:

1. make check #1159 "ofproto-dpif patch ports" consistently fails for me with this patch applied.

2. ./utilities/checkpatch.py reports some line-length and whitespace issues:

    == Checking "dpif-netdev-perf/ovs-dev-v3-1-3-dpif-netdev-Refactor-PMD-performance-into-dpif-netdev-perf.patch" ==
    ERROR: Too many signoffs; are you missing Co-authored-by lines?
    WARNING: Line length is >79-characters long
    #346 FILE: lib/dpif-netdev.c:544:
        struct ovs_refcount ref_cnt;    /* Every reference must be refcount'ed. */
    WARNING: Line has non-spaces leading whitespace
    WARNING: Line has trailing whitespace
    #347 FILE: lib/dpif-netdev.c:545:
    WARNING: Line length is >79-characters long
    #349 FILE: lib/dpif-netdev.c:547:
         * XPS disabled for this netdev. All static_tx_qid's are unique and less
    WARNING: Line has non-spaces leading whitespace
    WARNING: Line has trailing whitespace
    #352 FILE: lib/dpif-netdev.c:550:
    WARNING: Line length is >79-characters long
    WARNING: Line has trailing whitespace
    #413 FILE: lib/dpif-netdev.c:610:
        OVS_ALIGNED_VAR(CACHE_LINE_SIZE) struct pmd_perf_stats perf_stats;
    Lines checked: 976, Warnings: 8, Errors: 1

3. Does the new '-pmd' arg to pmd-stats-show interfere with the existing [dp] arg?

    sudo ./utilities/ovs-appctl dpif-netdev/pmd-stats-show -pmd 1 netdev@dpif-netdev
    "dpif-netdev/pmd-stats-show" command takes at most 2 arguments
    ovs-appctl: ovs-vswitchd: server returned an error

Otherwise it looks like a really useful patch, and the remainder of the series even more so.

Thanks,
Billy.
> -----Original Message----- > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > boun...@openvswitch.org] On Behalf Of Jan Scheurich > Sent: Tuesday, November 21, 2017 12:38 AM > To: 'ovs-dev@openvswitch.org' <ovs-dev@openvswitch.org> > Subject: [ovs-dev] [PATCH v3 1/3] dpif-netdev: Refactor PMD performance into > dpif-netdev-perf > > Add module dpif-netdev-perf to host all PMD performance-related data > structures and functions in dpif-netdev. Refactor the PMD stats handling in > dpif- > netdev and delegate whatever possible into the new module, using clean > interfaces to shield dpif-netdev from the implementation details. Accordingly, > the all PMD statistics members are moved from the main struct > dp_netdev_pmd_thread into a dedicated member of type struct pmd_perf_stats. > > Include Darrel's prior refactoring of PMD stats contained in [PATCH v5,2/3] > dpif- > netdev: Refactor some pmd stats: > > 1. The cycles per packet counts are now based on packets received rather than > packet passes through the datapath. > > 2. Packet counters are now kept for packets received and packets recirculated. > These are kept as separate counters for maintainability reasons. The cost of > incrementing these counters is negligible. These new counters are also > displayed to the user. > > 3. A display statistic is added for the average number of datapath passes per > packet. This should be useful for user debugging and understanding of packet > processing. > > 4. The user visible 'miss' counter is used for successful upcalls, rather > than the > sum of sucessful and unsuccessful upcalls. Hence, this becomes what user > historically understands by OVS 'miss upcall'. > The user display is annotated to make this clear as well. > > 5. The user visible 'lost' counter remains as failed upcalls, but is > annotated to > make it clear what the meaning is. > > 6. The enum pmd_stat_type is annotated to make the usage of the stats > counters clear. > > 7. 
The subtable lookup stats is renamed to make it clear that it relates to > masked > lookups. > > 8. The PMD stats test is updated to handle the new user stats of packets > received, packets recirculated and average number of datapath passes per > packet. > > On top of that introduce a "-pmd <core>" option to the PMD info commands to > filter the output for a single PMD. > > Signed-off-by: Jan Scheurich <jan.scheur...@ericsson.com> > Signed-off-by: Darrell Ball <dlu...@gmail.com> > > --- > lib/automake.mk | 2 + > lib/dpif-netdev-perf.c | 67 +++++++++ > lib/dpif-netdev-perf.h | 123 ++++++++++++++++ > lib/dpif-netdev.c | 371 > ++++++++++++++++++++----------------------------- > tests/pmd.at | 22 +-- > 5 files changed, 358 insertions(+), 227 deletions(-) create mode 100644 > lib/dpif- > netdev-perf.c create mode 100644 lib/dpif-netdev-perf.h > > diff --git a/lib/automake.mk b/lib/automake.mk index effe5b5..0b8df0e 100644 > --- a/lib/automake.mk > +++ b/lib/automake.mk > @@ -80,6 +80,8 @@ lib_libopenvswitch_la_SOURCES = \ > lib/dpdk.h \ > lib/dpif-netdev.c \ > lib/dpif-netdev.h \ > + lib/dpif-netdev-perf.c \ > + lib/dpif-netdev-perf.h \ > lib/dpif-provider.h \ > lib/dpif.c \ > lib/dpif.h \ > diff --git a/lib/dpif-netdev-perf.c b/lib/dpif-netdev-perf.c new file mode > 100644 > index 0000000..6a8f7c4 > --- /dev/null > +++ b/lib/dpif-netdev-perf.c > @@ -0,0 +1,67 @@ > +/* > + * Copyright (c) 2017 Ericsson AB. > + * > + * Licensed under the Apache License, Version 2.0 (the "License"); > + * you may not use this file except in compliance with the License. > + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or > implied. 
> + * See the License for the specific language governing permissions and > + * limitations under the License. > + */ > + > +#include <config.h> > + > +#ifdef DPDK_NETDEV > +#include <rte_cycles.h> > +#endif > + > +#include "openvswitch/dynamic-string.h" > +#include "openvswitch/vlog.h" > +#include "dpif-netdev-perf.h" > +#include "timeval.h" > + > +VLOG_DEFINE_THIS_MODULE(pmd_perf); > + > +void > +pmd_perf_stats_init(struct pmd_perf_stats *s) { > + memset(s, 0 , sizeof(*s)); > + s->start_ms = time_msec(); > +} > + > +void > +pmd_perf_read_counters(struct pmd_perf_stats *s, > + uint64_t stats[PMD_N_STATS]) { > + uint64_t val; > + > + /* These loops subtracts reference values ('*_zero') from the counters. > + * Since loads and stores are relaxed, it might be possible for a > '*_zero' > + * value to be more recent than the current value we're reading from the > + * counter. This is not a big problem, since these numbers are not > + * supposed to be too accurate, but we should at least make sure that > + * the result is not negative. */ > + for (int i = 0; i < PMD_N_STATS; i++) { > + atomic_read_relaxed(&s->counters.n[i], &val); > + if (val > s->counters.zero[i]) { > + stats[i] = val - s->counters.zero[i]; > + } else { > + stats[i] = 0; > + } > + } > +} > + > +void > +pmd_perf_stats_clear(struct pmd_perf_stats *s) { > + s->start_ms = 0; /* Marks start of clearing. */ > + for (int i = 0; i < PMD_N_STATS; i++) { > + atomic_read_relaxed(&s->counters.n[i], &s->counters.zero[i]); > + } > + s->start_ms = time_msec(); /* Clearing finished. */ } > + > diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h new file mode > 100644 > index 0000000..f55f7a2 > --- /dev/null > +++ b/lib/dpif-netdev-perf.h > @@ -0,0 +1,123 @@ > +/* > + * Copyright (c) 2017 Ericsson AB. > + * > + * Licensed under the Apache License, Version 2.0 (the "License"); > + * you may not use this file except in compliance with the License. 
> + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or > implied. > + * See the License for the specific language governing permissions and > + * limitations under the License. > + */ > + > +#ifndef PMD_PERF_H > +#define PMD_PERF_H 1 > + > +#include <stdbool.h> > +#include <stddef.h> > +#include <stdint.h> > +#include <string.h> > +#include <math.h> > + > +#include "openvswitch/vlog.h" > +#include "ovs-atomic.h" > +#include "timeval.h" > +#include "unixctl.h" > +#include "util.h" > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +enum pmd_stat_type { > + PMD_STAT_EXACT_HIT, /* Packets that had an exact match (emc). */ > + PMD_STAT_MASKED_HIT, /* Packets that matched in the flow table. */ > + PMD_STAT_MISS, /* Packets that did not match and upcall was ok. > */ > + PMD_STAT_LOST, /* Packets that did not match and upcall failed. > */ > + /* The above statistics account for the total > + * number of packet passes through the datapath > + * pipeline and should not be overlapping with > each > + * other. */ > + PMD_STAT_MASKED_LOOKUP, /* Number of subtable lookups for flow table > + hits. Each MASKED_HIT hit will have >= 1 > + MASKED_LOOKUP(s). */ > + PMD_STAT_RECV, /* Packets entering the datapath pipeline from an > + * interface. */ > + PMD_STAT_RECIRC, /* Packets reentering the datapath pipeline due > to > + * recirculation. */ > + PMD_CYCLES_POLL_IDLE, /* Cycles spent unsuccessful polling. */ > + PMD_CYCLES_POLL_BUSY, /* Cycles spent successfully polling and > + * processing polled packets. */ > + PMD_CYCLES_OVERHEAD, /* Cycles spent for other tasks. */ > + PMD_CYCLES_UPCALL, /* Cycles spent in upcalls. Included in > + PMD_CYCLES_POLL_BUSY. 
*/ > + PMD_CYCLES_ITER_IDLE, /* Cycles spent in idle iterations. */ > + PMD_CYCLES_ITER_BUSY, /* Cycles spent in busy iterations. */ > + PMD_N_STATS > +}; > + > +struct pmd_counters { > + atomic_uint64_t n[PMD_N_STATS]; /* Indexed by PMD_STAT_*. */ > + uint64_t zero[PMD_N_STATS]; > +}; > + > +struct pmd_perf_stats { > + uint64_t start_ms; > + uint64_t last_tsc; > + struct pmd_counters counters; > +}; > + > +enum pmd_info_type; > +struct dp_netdev_pmd_thread; > + > +struct pmd_perf_params { > + int command_type; > + bool histograms; > + size_t iter_hist_len; > + size_t ms_hist_len; > +}; > + > +void pmd_perf_stats_init(struct pmd_perf_stats *s); void > +pmd_perf_stats_clear(struct pmd_perf_stats *s); void > +pmd_perf_read_counters(struct pmd_perf_stats *s, > + uint64_t stats[PMD_N_STATS]); > + > +static inline void > +pmd_perf_update_counter(struct pmd_perf_stats *s, > + enum pmd_stat_type counter, int delta) { > + uint64_t tmp; > + atomic_read_relaxed(&s->counters.n[counter], &tmp); > + tmp += delta; > + atomic_store_relaxed(&s->counters.n[counter], tmp); } > + > +static inline void > +pmd_perf_start_iteration(struct pmd_perf_stats *s, uint64_t now_tsc) { > + if (OVS_UNLIKELY(s->start_ms == 0)) { > + /* Stats not yet fully initialised. */ > + return; > + } > + s->last_tsc = now_tsc; > +} > + > +static inline void > +pmd_perf_end_iteration(struct pmd_perf_stats *s, uint64_t now_tsc, > + int packets) > +{ > + uint64_t cycles = now_tsc - s->last_tsc; > + > + /* No need for atomic updates as only invoked within PMD thread. 
*/ > + if (packets > 0) { > + s->counters.n[PMD_CYCLES_ITER_BUSY] += cycles; > + } else { > + s->counters.n[PMD_CYCLES_ITER_IDLE] += cycles; > + } > +} > + > +#endif /* pmd-perf.h */ > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 0a62630..bf99f81 > 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -43,6 +43,7 @@ > #include "csum.h" > #include "dp-packet.h" > #include "dpif.h" > +#include "dpif-netdev-perf.h" > #include "dpif-provider.h" > #include "dummy.h" > #include "fat-rwlock.h" > @@ -330,23 +331,6 @@ static struct dp_netdev_port > *dp_netdev_lookup_port(const struct dp_netdev *dp, > odp_port_t) > OVS_REQUIRES(dp->port_mutex); > > -enum dp_stat_type { > - DP_STAT_EXACT_HIT, /* Packets that had an exact match (emc). */ > - DP_STAT_MASKED_HIT, /* Packets that matched in the flow table. */ > - DP_STAT_MISS, /* Packets that did not match. */ > - DP_STAT_LOST, /* Packets not passed up to the client. */ > - DP_STAT_LOOKUP_HIT, /* Number of subtable lookups for flow table > - hits */ > - DP_N_STATS > -}; > - > -enum pmd_cycles_counter_type { > - PMD_CYCLES_IDLE, /* Cycles spent idle or unsuccessful polling > */ > - PMD_CYCLES_PROCESSING, /* Cycles spent successfully polling and > - * processing polled packets */ > - PMD_N_CYCLES > -}; > - > enum rxq_cycles_counter_type { > RXQ_CYCLES_PROC_CURR, /* Cycles spent successfully polling and > processing packets during the current @@ > -496,18 +480,6 > @@ struct dp_netdev_actions *dp_netdev_flow_get_actions( > const struct dp_netdev_flow *); > static void dp_netdev_actions_free(struct dp_netdev_actions *); > > -/* Contained by struct dp_netdev_pmd_thread's 'stats' member. */ -struct > dp_netdev_pmd_stats { > - /* Indexed by DP_STAT_*. */ > - atomic_ullong n[DP_N_STATS]; > -}; > - > -/* Contained by struct dp_netdev_pmd_thread's 'cycle' member. */ -struct > dp_netdev_pmd_cycles { > - /* Indexed by PMD_CYCLES_*. 
*/ > - atomic_ullong n[PMD_N_CYCLES]; > -}; > - > struct polled_queue { > struct dp_netdev_rxq *rxq; > odp_port_t port_no; > @@ -566,20 +538,22 @@ struct dp_netdev_pmd_thread { > * need to be protected by 'non_pmd_mutex'. Every other instance > * will only be accessed by its own pmd thread. */ > OVS_ALIGNED_VAR(CACHE_LINE_SIZE) struct emc_cache flow_cache; > - struct ovs_refcount ref_cnt; /* Every reference must be refcount'ed. > */ > - > - /* Queue id used by this pmd thread to send packets on all netdevs if > - * XPS disabled for this netdev. All static_tx_qid's are unique and less > - * than 'cmap_count(dp->poll_threads)'. */ > - uint32_t static_tx_qid; > - > - /* Flow-Table and classifiers > - * > - * Writers of 'flow_table' must take the 'flow_mutex'. Corresponding > - * changes to 'classifiers' must be made while still holding the > - * 'flow_mutex'. > - */ > - struct ovs_mutex flow_mutex; > + PADDED_MEMBERS(CACHE_LINE_SIZE, > + struct ovs_refcount ref_cnt; /* Every reference must be > refcount'ed. */ > + > + /* Queue id used by this pmd thread to send packets on all netdevs if > + * XPS disabled for this netdev. All static_tx_qid's are unique and > less > + * than 'cmap_count(dp->poll_threads)'. */ > + uint32_t static_tx_qid; > + > + /* Flow-Table and classifiers > + * > + * Writers of 'flow_table' must take the 'flow_mutex'. Corresponding > + * changes to 'classifiers' must be made while still holding the > + * 'flow_mutex'. > + */ > + struct ovs_mutex flow_mutex; > + ); > PADDED_MEMBERS(CACHE_LINE_SIZE, > struct cmap flow_table OVS_GUARDED; /* Flow table. */ > > @@ -591,20 +565,10 @@ struct dp_netdev_pmd_thread { > are stored for each polled rxq. */ > long long int rxq_next_cycle_store; > > - /* Cycles counters */ > - struct dp_netdev_pmd_cycles cycles; > - > /* Used to count cycles. See 'cycles_counter_end()'. */ > unsigned long long last_cycles; > struct latch exit_latch; /* For terminating the pmd thread. 
*/ > - ); > - > - PADDED_MEMBERS(CACHE_LINE_SIZE, > - /* Statistics. */ > - struct dp_netdev_pmd_stats stats; > > - struct seq *reload_seq; > - uint64_t last_reload_seq; > atomic_bool reload; /* Do we need to reload ports? */ > bool isolated; > > @@ -612,11 +576,11 @@ struct dp_netdev_pmd_thread { > bool need_reload; > /* 5 pad bytes. */ > ); > - > PADDED_MEMBERS(CACHE_LINE_SIZE, > + struct seq *reload_seq; > + uint64_t last_reload_seq; > struct ovs_mutex port_mutex; /* Mutex for 'poll_list' > and 'tx_ports'. */ > - /* 16 pad bytes. */ > ); > PADDED_MEMBERS(CACHE_LINE_SIZE, > /* List of rx queues to poll. */ @@ -642,15 +606,8 @@ struct > dp_netdev_pmd_thread { > struct hmap send_port_cache; > ); > > - PADDED_MEMBERS(CACHE_LINE_SIZE, > - /* Only a pmd thread can write on its own 'cycles' and 'stats'. > - * The main thread keeps 'stats_zero' and 'cycles_zero' as base > - * values and subtracts them from 'stats' and 'cycles' before > - * reporting to the user */ > - unsigned long long stats_zero[DP_N_STATS]; > - uint64_t cycles_zero[PMD_N_CYCLES]; > - /* 8 pad bytes. */ > - ); > + /* Keep track of detailed PMD performance statistics. */ > + OVS_ALIGNED_VAR(CACHE_LINE_SIZE) struct pmd_perf_stats perf_stats; > }; > > /* Interface to netdev-based datapath. */ @@ -813,46 +770,11 @@ enum > pmd_info_type { }; > > static void > -pmd_info_show_stats(struct ds *reply, > - struct dp_netdev_pmd_thread *pmd, > - unsigned long long stats[DP_N_STATS], > - uint64_t cycles[PMD_N_CYCLES]) > +format_pmd_thread(struct ds *reply, > + struct dp_netdev_pmd_thread *pmd) > { > - unsigned long long total_packets; > - uint64_t total_cycles = 0; > - int i; > - > - /* These loops subtracts reference values ('*_zero') from the counters. > - * Since loads and stores are relaxed, it might be possible for a > '*_zero' > - * value to be more recent than the current value we're reading from the > - * counter. 
This is not a big problem, since these numbers are not > - * supposed to be too accurate, but we should at least make sure that > - * the result is not negative. */ > - for (i = 0; i < DP_N_STATS; i++) { > - if (stats[i] > pmd->stats_zero[i]) { > - stats[i] -= pmd->stats_zero[i]; > - } else { > - stats[i] = 0; > - } > - } > - > - /* Sum of all the matched and not matched packets gives the total. */ > - total_packets = stats[DP_STAT_EXACT_HIT] + stats[DP_STAT_MASKED_HIT] > - + stats[DP_STAT_MISS]; > - > - for (i = 0; i < PMD_N_CYCLES; i++) { > - if (cycles[i] > pmd->cycles_zero[i]) { > - cycles[i] -= pmd->cycles_zero[i]; > - } else { > - cycles[i] = 0; > - } > - > - total_cycles += cycles[i]; > - } > - > ds_put_cstr(reply, (pmd->core_id == NON_PMD_CORE_ID) > ? "main thread" : "pmd thread"); > - > if (pmd->numa_id != OVS_NUMA_UNSPEC) { > ds_put_format(reply, " numa_id %d", pmd->numa_id); > } > @@ -860,16 +782,39 @@ pmd_info_show_stats(struct ds *reply, > ds_put_format(reply, " core_id %u", pmd->core_id); > } > ds_put_cstr(reply, ":\n"); > +} > + > +static void > +pmd_info_show_stats(struct ds *reply, > + struct dp_netdev_pmd_thread *pmd) { > + uint64_t stats[PMD_N_STATS]; > + uint64_t total_cycles = 0; > + > + pmd_perf_read_counters(&pmd->perf_stats, stats); > + total_cycles = stats[PMD_CYCLES_ITER_IDLE] > + + stats[PMD_CYCLES_ITER_BUSY]; > + > + format_pmd_thread(reply, pmd); > > ds_put_format(reply, > - "\temc hits:%llu\n\tmegaflow hits:%llu\n" > - "\tavg. subtable lookups per hit:%.2f\n" > - "\tmiss:%llu\n\tlost:%llu\n", > - stats[DP_STAT_EXACT_HIT], stats[DP_STAT_MASKED_HIT], > - stats[DP_STAT_MASKED_HIT] > 0 > - ? (1.0*stats[DP_STAT_LOOKUP_HIT])/stats[DP_STAT_MASKED_HIT] > - : 0, > - stats[DP_STAT_MISS], stats[DP_STAT_LOST]); > + "\tpackets received:%"PRIu64"\n" > + "\tpacket recirculations:%"PRIu64"\n" > + "\tavg. datapath passes per packet:%.2f\n" > + "\temc hits:%"PRIu64"\n" > + "\tmegaflow hits:%"PRIu64"\n" > + "\tavg. 
subtable lookups per megaflow hit:%.2f\n" > + "\tmiss(i.e. lookup miss with success upcall):%"PRIu64"\n" > + "\tlost(i.e. lookup miss with failed upcall):%"PRIu64"\n", > + stats[PMD_STAT_RECV], stats[PMD_STAT_RECIRC], > + stats[PMD_STAT_RECV] > 0 > + ? (1.0 * (stats[PMD_STAT_RECV] + stats[PMD_STAT_RECIRC])) > + / stats[PMD_STAT_RECV] : 0, > + stats[PMD_STAT_EXACT_HIT], stats[PMD_STAT_MASKED_HIT], > + stats[PMD_STAT_MASKED_HIT] > 0 > + ? (1.0 * stats[PMD_STAT_MASKED_LOOKUP]) > + / stats[PMD_STAT_MASKED_HIT] > + : 0, stats[PMD_STAT_MISS], stats[PMD_STAT_LOST]); > > if (total_cycles == 0) { > return; > @@ -878,46 +823,26 @@ pmd_info_show_stats(struct ds *reply, > ds_put_format(reply, > "\tidle cycles:%"PRIu64" (%.02f%%)\n" > "\tprocessing cycles:%"PRIu64" (%.02f%%)\n", > - cycles[PMD_CYCLES_IDLE], > - cycles[PMD_CYCLES_IDLE] / (double)total_cycles * 100, > - cycles[PMD_CYCLES_PROCESSING], > - cycles[PMD_CYCLES_PROCESSING] / (double)total_cycles * > 100); > + stats[PMD_CYCLES_ITER_IDLE], > + stats[PMD_CYCLES_ITER_IDLE] / (double)total_cycles * 100, > + stats[PMD_CYCLES_ITER_BUSY], > + stats[PMD_CYCLES_ITER_BUSY] / (double)total_cycles * > + 100); > > + uint64_t total_packets = stats[PMD_STAT_RECV]; > if (total_packets == 0) { > return; > } > > ds_put_format(reply, > - "\tavg cycles per packet: %.02f (%"PRIu64"/%llu)\n", > + "\tavg cycles per packet: %.02f > + (%"PRIu64"/%"PRIu64")\n", > total_cycles / (double)total_packets, > total_cycles, total_packets); > > ds_put_format(reply, > "\tavg processing cycles per packet: " > - "%.02f (%"PRIu64"/%llu)\n", > - cycles[PMD_CYCLES_PROCESSING] / (double)total_packets, > - cycles[PMD_CYCLES_PROCESSING], total_packets); > -} > - > -static void > -pmd_info_clear_stats(struct ds *reply OVS_UNUSED, > - struct dp_netdev_pmd_thread *pmd, > - unsigned long long stats[DP_N_STATS], > - uint64_t cycles[PMD_N_CYCLES]) > -{ > - int i; > - > - /* We cannot write 'stats' and 'cycles' (because they're written by other > - * threads) and we 
shouldn't change 'stats' (because they're used to > count > - * datapath stats, which must not be cleared here). Instead, we save the > - * current values and subtract them from the values to be displayed in > the > - * future */ > - for (i = 0; i < DP_N_STATS; i++) { > - pmd->stats_zero[i] = stats[i]; > - } > - for (i = 0; i < PMD_N_CYCLES; i++) { > - pmd->cycles_zero[i] = cycles[i]; > - } > + "%.02f (%"PRIu64"/%"PRIu64")\n", > + stats[PMD_CYCLES_POLL_BUSY] / (double)total_packets, > + stats[PMD_CYCLES_POLL_BUSY], total_packets); > } > > static int > @@ -1077,23 +1002,34 @@ dpif_netdev_pmd_info(struct unixctl_conn *conn, > int argc, const char *argv[], > struct ds reply = DS_EMPTY_INITIALIZER; > struct dp_netdev_pmd_thread **pmd_list; > struct dp_netdev *dp = NULL; > - size_t n; > enum pmd_info_type type = *(enum pmd_info_type *) aux; > + int core_id = -1; > + size_t n; > > ovs_mutex_lock(&dp_netdev_mutex); > > - if (argc == 2) { > - dp = shash_find_data(&dp_netdevs, argv[1]); > - } else if (shash_count(&dp_netdevs) == 1) { > - /* There's only one datapath */ > - dp = shash_first(&dp_netdevs)->data; > + while (argc > 1) { > + if (!strcmp(argv[1], "-pmd")) { > + core_id = strtol(argv[2], NULL, 10); > + argc -= 2; > + argv += 2; > + } else { > + dp = shash_find_data(&dp_netdevs, argv[1]); > + argc -= 1; > + argv += 1; > + } > } > > if (!dp) { > - ovs_mutex_unlock(&dp_netdev_mutex); > - unixctl_command_reply_error(conn, > - "please specify an existing datapath"); > - return; > + if (shash_count(&dp_netdevs) == 1) { > + /* There's only one datapath */ > + dp = shash_first(&dp_netdevs)->data; > + } else { > + ovs_mutex_unlock(&dp_netdev_mutex); > + unixctl_command_reply_error(conn, > + "please specify an existing > datapath"); > + return; > + } > } > > sorted_poll_thread_list(dp, &pmd_list, &n); @@ -1102,26 +1038,15 @@ > dpif_netdev_pmd_info(struct unixctl_conn *conn, int argc, const char *argv[], > if (!pmd) { > break; > } > - > + if (core_id != -1 && pmd->core_id != 
core_id) { > + continue; > + } > if (type == PMD_INFO_SHOW_RXQ) { > pmd_info_show_rxq(&reply, pmd); > - } else { > - unsigned long long stats[DP_N_STATS]; > - uint64_t cycles[PMD_N_CYCLES]; > - > - /* Read current stats and cycle counters */ > - for (size_t j = 0; j < ARRAY_SIZE(stats); j++) { > - atomic_read_relaxed(&pmd->stats.n[j], &stats[j]); > - } > - for (size_t j = 0; j < ARRAY_SIZE(cycles); j++) { > - atomic_read_relaxed(&pmd->cycles.n[j], &cycles[j]); > - } > - > - if (type == PMD_INFO_CLEAR_STATS) { > - pmd_info_clear_stats(&reply, pmd, stats, cycles); > - } else if (type == PMD_INFO_SHOW_STATS) { > - pmd_info_show_stats(&reply, pmd, stats, cycles); > - } > + } else if (type == PMD_INFO_CLEAR_STATS) { > + pmd_perf_stats_clear(&pmd->perf_stats); > + } else if (type == PMD_INFO_SHOW_STATS) { > + pmd_info_show_stats(&reply, pmd); > } > } > free(pmd_list); > @@ -1139,14 +1064,14 @@ dpif_netdev_init(void) > clear_aux = PMD_INFO_CLEAR_STATS, > poll_aux = PMD_INFO_SHOW_RXQ; > > - unixctl_command_register("dpif-netdev/pmd-stats-show", "[dp]", > - 0, 1, dpif_netdev_pmd_info, > + unixctl_command_register("dpif-netdev/pmd-stats-show", "[-pmd core] > [dp]", > + 0, 2, dpif_netdev_pmd_info, > (void *)&show_aux); > - unixctl_command_register("dpif-netdev/pmd-stats-clear", "[dp]", > - 0, 1, dpif_netdev_pmd_info, > + unixctl_command_register("dpif-netdev/pmd-stats-clear", "[-pmd core] > [dp]", > + 0, 2, dpif_netdev_pmd_info, > (void *)&clear_aux); > - unixctl_command_register("dpif-netdev/pmd-rxq-show", "[dp]", > - 0, 1, dpif_netdev_pmd_info, > + unixctl_command_register("dpif-netdev/pmd-rxq-show", "[-pmd core] [dp]", > + 0, 2, dpif_netdev_pmd_info, > (void *)&poll_aux); > unixctl_command_register("dpif-netdev/pmd-rxq-rebalance", "[dp]", > 0, 1, dpif_netdev_pmd_rebalance, @@ -1482,23 > +1407,19 @@ > dpif_netdev_get_stats(const struct dpif *dpif, struct dpif_dp_stats *stats) { > struct dp_netdev *dp = get_dp_netdev(dpif); > struct dp_netdev_pmd_thread *pmd; > + uint64_t 
pmd_stats[PMD_N_STATS]; > > - stats->n_flows = stats->n_hit = stats->n_missed = stats->n_lost = 0; > + stats->n_flows = stats->n_hit = > + stats->n_mask_hit = stats->n_missed = stats->n_lost = 0; > CMAP_FOR_EACH (pmd, node, &dp->poll_threads) { > - unsigned long long n; > stats->n_flows += cmap_count(&pmd->flow_table); > - > - atomic_read_relaxed(&pmd->stats.n[DP_STAT_MASKED_HIT], &n); > - stats->n_hit += n; > - atomic_read_relaxed(&pmd->stats.n[DP_STAT_EXACT_HIT], &n); > - stats->n_hit += n; > - atomic_read_relaxed(&pmd->stats.n[DP_STAT_MISS], &n); > - stats->n_missed += n; > - atomic_read_relaxed(&pmd->stats.n[DP_STAT_LOST], &n); > - stats->n_lost += n; > + pmd_perf_read_counters(&pmd->perf_stats, pmd_stats); > + stats->n_hit += pmd_stats[PMD_STAT_EXACT_HIT]; > + stats->n_mask_hit += pmd_stats[PMD_STAT_MASKED_HIT]; > + stats->n_missed += pmd_stats[PMD_STAT_MISS]; > + stats->n_lost += pmd_stats[PMD_STAT_LOST]; > } > stats->n_masks = UINT32_MAX; > - stats->n_mask_hit = UINT64_MAX; > > return 0; > } > @@ -3174,28 +3095,28 @@ cycles_count_start(struct dp_netdev_pmd_thread > *pmd) > /* Stop counting cycles and add them to the counter 'type' */ static inline > void > cycles_count_end(struct dp_netdev_pmd_thread *pmd, > - enum pmd_cycles_counter_type type) > + enum pmd_stat_type type) > OVS_RELEASES(&cycles_counter_fake_mutex) > OVS_NO_THREAD_SAFETY_ANALYSIS > { > unsigned long long interval = cycles_counter() - pmd->last_cycles; > > - non_atomic_ullong_add(&pmd->cycles.n[type], interval); > + pmd_perf_update_counter(&pmd->perf_stats, type, interval); > } > > /* Calculate the intermediate cycle result and add to the counter 'type' */ > static > inline void cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd, > struct dp_netdev_rxq *rxq, > - enum pmd_cycles_counter_type type) > + enum pmd_stat_type type) > OVS_NO_THREAD_SAFETY_ANALYSIS > { > unsigned long long new_cycles = cycles_counter(); > unsigned long long interval = new_cycles - pmd->last_cycles; > 
pmd->last_cycles = new_cycles; > > - non_atomic_ullong_add(&pmd->cycles.n[type], interval); > - if (rxq && (type == PMD_CYCLES_PROCESSING)) { > + pmd_perf_update_counter(&pmd->perf_stats, type, interval); > + if (rxq && (type == PMD_CYCLES_POLL_BUSY)) { > /* Add to the amount of current processing cycles. */ > non_atomic_ullong_add(&rxq->cycles[RXQ_CYCLES_PROC_CURR], > interval); > } > @@ -3885,12 +3806,12 @@ dpif_netdev_run(struct dpif *dpif) > port->port_no); > cycles_count_intermediate(non_pmd, NULL, > process_packets > - ? PMD_CYCLES_PROCESSING > - : PMD_CYCLES_IDLE); > + ? PMD_CYCLES_POLL_BUSY > + : PMD_CYCLES_POLL_IDLE); > } > } > } > - cycles_count_end(non_pmd, PMD_CYCLES_IDLE); > + cycles_count_end(non_pmd, PMD_CYCLES_POLL_IDLE); > dpif_netdev_xps_revalidate_pmd(non_pmd, time_msec(), false); > ovs_mutex_unlock(&dp->non_pmd_mutex); > > @@ -4034,6 +3955,7 @@ static void * > pmd_thread_main(void *f_) > { > struct dp_netdev_pmd_thread *pmd = f_; > + struct pmd_perf_stats *s = &pmd->perf_stats; > unsigned int lc = 0; > struct polled_queue *poll_list; > bool exiting; > @@ -4069,13 +3991,17 @@ reload: > > cycles_count_start(pmd); > for (;;) { > + uint64_t iter_packets = 0; > + pmd_perf_start_iteration(s, pmd->last_cycles); > for (i = 0; i < poll_cnt; i++) { > process_packets = > dp_netdev_process_rxq_port(pmd, poll_list[i].rxq->rx, > poll_list[i].port_no); > cycles_count_intermediate(pmd, poll_list[i].rxq, > - process_packets ? PMD_CYCLES_PROCESSING > - : PMD_CYCLES_IDLE); > + process_packets > + ? 
PMD_CYCLES_POLL_BUSY > + : PMD_CYCLES_POLL_IDLE); > + iter_packets += process_packets; > } > > if (lc++ > 1024) { > @@ -4093,10 +4019,12 @@ reload: > if (reload) { > break; > } > + cycles_count_intermediate(pmd, NULL, PMD_CYCLES_OVERHEAD); > } > + pmd_perf_end_iteration(s, pmd->last_cycles, iter_packets); > } > > - cycles_count_end(pmd, PMD_CYCLES_IDLE); > + cycles_count_end(pmd, PMD_CYCLES_OVERHEAD); > > poll_cnt = pmd_load_queues_and_ports(pmd, &poll_list); > exiting = latch_is_set(&pmd->exit_latch); @@ -4549,6 +4477,7 @@ > dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct > dp_netdev *dp, > } > cmap_insert(&dp->poll_threads, CONST_CAST(struct cmap_node *, &pmd- > >node), > hash_int(core_id, 0)); > + pmd_perf_stats_init(&pmd->perf_stats); > } > > static void > @@ -4746,13 +4675,6 @@ dp_netdev_flow_used(struct dp_netdev_flow > *netdev_flow, int cnt, int size, > atomic_store_relaxed(&netdev_flow->stats.tcp_flags, flags); } > > -static void > -dp_netdev_count_packet(struct dp_netdev_pmd_thread *pmd, > - enum dp_stat_type type, int cnt) > -{ > - non_atomic_ullong_add(&pmd->stats.n[type], cnt); > -} > - > static int > dp_netdev_upcall(struct dp_netdev_pmd_thread *pmd, struct dp_packet > *packet_, > struct flow *flow, struct flow_wildcards *wc, ovs_u128 > *ufid, @@ - > 4926,6 +4848,9 @@ emc_processing(struct dp_netdev_pmd_thread *pmd, > int i; > > atomic_read_relaxed(&pmd->dp->emc_insert_min, &cur_min); > + pmd_perf_update_counter(&pmd->perf_stats, > + md_is_valid ? 
PMD_STAT_RECIRC : PMD_STAT_RECV, > + cnt); > > DP_PACKET_BATCH_REFILL_FOR_EACH (i, cnt, packet, packets_) { > struct dp_netdev_flow *flow; > @@ -4974,24 +4899,24 @@ emc_processing(struct dp_netdev_pmd_thread > *pmd, > } > } > > - dp_netdev_count_packet(pmd, DP_STAT_EXACT_HIT, > - cnt - n_dropped - n_missed); > + pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_EXACT_HIT, > + cnt - n_dropped - n_missed); > > return dp_packet_batch_size(packets_); } > > -static inline void > +static inline int > handle_packet_upcall(struct dp_netdev_pmd_thread *pmd, > struct dp_packet *packet, > const struct netdev_flow_key *key, > struct ofpbuf *actions, struct ofpbuf *put_actions, > - int *lost_cnt, long long now) > + long long now) > { > struct ofpbuf *add_actions; > struct dp_packet_batch b; > struct match match; > ovs_u128 ufid; > - int error; > + int error = 0; > > match.tun_md.valid = false; > miniflow_expand(&key->mf, &match.flow); @@ -5005,8 +4930,7 @@ > handle_packet_upcall(struct dp_netdev_pmd_thread *pmd, > put_actions); > if (OVS_UNLIKELY(error && error != ENOSPC)) { > dp_packet_delete(packet); > - (*lost_cnt)++; > - return; > + return error; > } > > /* The Netlink encoding of datapath flow keys cannot express @@ -5046,6 > +4970,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd, > ovs_mutex_unlock(&pmd->flow_mutex); > emc_probabilistic_insert(pmd, key, netdev_flow); > } > + return error; > } > > static inline void > @@ -5067,7 +4992,7 @@ fast_path_processing(struct dp_netdev_pmd_thread > *pmd, > struct dpcls *cls; > struct dpcls_rule *rules[PKT_ARRAY_SIZE]; > struct dp_netdev *dp = pmd->dp; > - int miss_cnt = 0, lost_cnt = 0; > + int upcall_ok_cnt = 0, upcall_fail_cnt = 0; > int lookup_cnt = 0, add_lookup_cnt; > bool any_miss; > size_t i; > @@ -5109,9 +5034,14 @@ fast_path_processing(struct dp_netdev_pmd_thread > *pmd, > continue; > } > > - miss_cnt++; > - handle_packet_upcall(pmd, packet, &keys[i], &actions, > - &put_actions, &lost_cnt, now); > + int error = 
handle_packet_upcall(pmd, packet, &keys[i], > + &actions, &put_actions, now); > + > + if (OVS_UNLIKELY(error)) { > + upcall_fail_cnt++; > + } else { > + upcall_ok_cnt++; > + } > } > > ofpbuf_uninit(&actions); > @@ -5121,8 +5051,7 @@ fast_path_processing(struct dp_netdev_pmd_thread > *pmd, > DP_PACKET_BATCH_FOR_EACH (packet, packets_) { > if (OVS_UNLIKELY(!rules[i])) { > dp_packet_delete(packet); > - lost_cnt++; > - miss_cnt++; > + upcall_fail_cnt++; > } > } > } > @@ -5140,10 +5069,14 @@ fast_path_processing(struct > dp_netdev_pmd_thread *pmd, > dp_netdev_queue_batches(packet, flow, &keys[i].mf, batches, > n_batches); > } > > - dp_netdev_count_packet(pmd, DP_STAT_MASKED_HIT, cnt - miss_cnt); > - dp_netdev_count_packet(pmd, DP_STAT_LOOKUP_HIT, lookup_cnt); > - dp_netdev_count_packet(pmd, DP_STAT_MISS, miss_cnt); > - dp_netdev_count_packet(pmd, DP_STAT_LOST, lost_cnt); > + pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_MASKED_HIT, > + cnt - upcall_ok_cnt - upcall_fail_cnt); > + pmd_perf_update_counter(&pmd->perf_stats, > PMD_STAT_MASKED_LOOKUP, > + lookup_cnt); > + pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_MISS, > + upcall_ok_cnt); > + pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_LOST, > + upcall_fail_cnt); > } > > /* Packets enter the datapath from a port (or from recirculation) here. 
> diff --git a/tests/pmd.at b/tests/pmd.at index e39a23a..0356f87 100644 > --- a/tests/pmd.at > +++ b/tests/pmd.at > @@ -170,13 +170,16 @@ dummy@ovs-dummy: hit:0 missed:0 > p0 7/1: (dummy-pmd: configured_rx_queues=4, > configured_tx_queues=<cleared>, requested_rx_queues=4, > requested_tx_queues=<cleared>) > ]) > > -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed > SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 5], [0], [dnl > +AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed > +SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 8], [0], [dnl > pmd thread numa_id <cleared> core_id <cleared>: > + packets received:0 > + packet recirculations:0 > + avg. datapath passes per packet:0.00 > emc hits:0 > megaflow hits:0 > - avg. subtable lookups per hit:0.00 > - miss:0 > - lost:0 > + avg. subtable lookups per megaflow hit:0.00 > + miss(i.e. lookup miss with success upcall):0 > + lost(i.e. lookup miss with failed upcall):0 > ]) > > ovs-appctl time/stop > @@ -197,13 +200,16 @@ AT_CHECK([cat ovs-vswitchd.log | filter_flow_install > | strip_xout], [0], [dnl > recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:77,dst=50 > :54:00:00:01:78),eth_type(0x0800),ipv4(frag=no), actions: <del> > ]) > > -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed > SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 5], [0], [dnl > +AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed > +SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 8], [0], [dnl > pmd thread numa_id <cleared> core_id <cleared>: > + packets received:20 > + packet recirculations:0 > + avg. datapath passes per packet:1.00 > emc hits:19 > megaflow hits:0 > - avg. subtable lookups per hit:0.00 > - miss:1 > - lost:0 > + avg. subtable lookups per megaflow hit:0.00 > + miss(i.e. lookup miss with success upcall):1 > + lost(i.e. 
> lookup miss with failed upcall):0 > ]) > > OVS_VSWITCHD_STOP > -- > 1.9.1 > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev