On 5/17/23 14:50, Balazs Nemeth wrote:
> The only way that stats->{n_packets,n_bytes} would decrease is due to an
> overflow, or if there are bugs in how statistics are handled. In the
> past, there were multiple issues that caused a jump backward. A
> workaround was in place to set the statistics to 0 in that case. When
> this happened while the revalidator was under heavy load, the workaround
> had an unintended side effect where should_revalidate returned false
> causing the flow to be removed because the metric it calculated was
> based on a bogus value. Since many of those bugs have now been
> identified and resolved, there is no need to set the statistics to 0. In
> addition, the (unlikely) overflow still needs to be handled
> appropriately. If an unexpected jump does happen, just log it as a
> warning.
>
> Signed-off-by: Balazs Nemeth <[email protected]>
> ---
> ofproto/ofproto-dpif-upcall.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index cd57fdbd9..3cf080c75 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -2372,18 +2372,17 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key
> *ukey,
>
> push.used = stats->used;
> push.tcp_flags = stats->tcp_flags;
> - push.n_packets = (stats->n_packets > ukey->stats.n_packets
> - ? stats->n_packets - ukey->stats.n_packets
> - : 0);
> - push.n_bytes = (stats->n_bytes > ukey->stats.n_bytes
> - ? stats->n_bytes - ukey->stats.n_bytes
> - : 0);
> + push.n_packets = stats->n_packets - ukey->stats.n_packets;
> + push.n_bytes = stats->n_bytes - ukey->stats.n_bytes;
>
> if (stats->n_packets < ukey->stats.n_packets &&
> ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
> /* Report cases where the packet counter is lower than the previous
> * instance, but exclude the potential wrapping of an uint64_t. */
> COVERAGE_INC(ukey_invalid_stat_reset);
> +
> + VLOG_WARN("Unexpected jump in packet stats from %"PRIu64" to
> %"PRIu64,
> + stats->n_packets, ukey->stats.n_packets);
Hi, Balazs. We should, probbaly, print out the key and actions,
so we can better understand which flow caused the issue.
We should be able to use odp_flow_key_format() and format_odp_actions()
for this. ukey contains the key.
What do you think?
We should also rate-limit the log just in case.
Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev