On Tue, Sep 29, 2020 at 05:27:31PM +0200, Ilya Maximets wrote: > On 9/15/20 9:21 PM, Flavio Leitner wrote: > > When the number of flows in the datapath reaches twice the > > maximum, revalidators will delete all flows as an emergency > > action to recover. In that case, log a message with values > > and increase a coverage counter. > > > > Signed-off-by: Flavio Leitner <[email protected]> > > --- > > ofproto/ofproto-dpif-upcall.c | 15 ++++++++++++++- > > 1 file changed, 14 insertions(+), 1 deletion(-) > > > > diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c > > index 05a912f57..12e94287a 100644 > > --- a/ofproto/ofproto-dpif-upcall.c > > +++ b/ofproto/ofproto-dpif-upcall.c > > @@ -57,6 +57,7 @@ COVERAGE_DEFINE(upcall_ukey_contention); > > COVERAGE_DEFINE(upcall_ukey_replace); > > COVERAGE_DEFINE(revalidate_missed_dp_flow); > > COVERAGE_DEFINE(upcall_flow_limit_hit); > > +COVERAGE_DEFINE(upcall_flow_limit_kill); > > > > /* A thread that reads upcalls from dpif, forwards each upcall's packet, > > * and possibly sets up a kernel flow as a cache. */ > > @@ -2622,6 +2623,7 @@ revalidate(struct revalidator *revalidator) > > > > long long int max_idle; > > long long int now; > > + size_t kill_all_limit; > > size_t n_dp_flows; > > bool kill_them_all; > > > > @@ -2649,7 +2651,18 @@ revalidate(struct revalidator *revalidator) > > COVERAGE_INC(upcall_flow_limit_hit); > > } > > > > - kill_them_all = n_dp_flows > flow_limit * 2; > > + kill_them_all = false; > > + kill_all_limit = flow_limit * 2; > > + if (OVS_UNLIKELY(n_dp_flows > kill_all_limit)) { > > + static struct vlog_rate_limit rlem = VLOG_RATE_LIMIT_INIT(1, > > 5); > > + > > + VLOG_WARN_RL(&rlem, "Emergency: deleting all flows " > > + "(now: %"PRIuSIZE", max: %"PRIuSIZE")", > > Almost he same concern here as in the other patch. But this message is even > more misleading since it prints doubled dynamically adjusted limit which is > OK to print, but how to use this value in practice if you do not know > internals > of this code? > > Maybe something like "Number of datapath flows (%PRIuSIZE) twice as high as > current dynamic flow limit (%PRIuSIZE). Deleting all flows as an emergency > measure."? > > Another thing is that this message is actually misleading anyway because > revalidators recalculates the number of flows for each batch (50) of dumped > flows (every 100ms, to be honest), so we will not actually flush all the > flows in many cases. We will remove them until situation is normalized. > > So, probably: > "Number of datapath flows (%PRIuSIZE) twice as high as current dynamic flow > limit (%PRIuSIZE). Starting to delete flows unconditionally as an emergency > measure."? > > Also, this message should be printed only once per revalidation cycle, I > guess, > not for every flow batch. > > > + n_dp_flows, kill_all_limit); > > + COVERAGE_INC(upcall_flow_limit_kill); > > + kill_them_all = true; > > + } > > + > > max_idle = n_dp_flows > flow_limit ? 100 : ofproto_max_idle; > > > > udpif->dpif->current_ms = time_msec(); > > >
Sent v2: https://mail.openvswitch.org/pipermail/ovs-dev/2020-September/375647.html -- fbl _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
