On 1/31/24 10:29, Zhangweiwei via discuss wrote:
> Hi,
> 
> I encountered an issue while using an OVS bond in balance-tcp mode.
> After taking bond members down and bringing them back up, the bond
> entry statistics displayed by ovs-appctl bond/show overflowed.
> Besides the wrong statistics values, this also led to longer
> load balancing times for bond members.
> 
> 1. Information:
> ovs version: 2.17.2
> bond mode: balance-tcp
> openflow: cookie=0x0, duration=7027.270s, table=0, n_packets=15169077, 
> n_bytes=9334457220, priority=0 actions=NORMAL
> datapath_type: netdev
> 
> 2. ovs-appctl bond/show output:
> 
> [root@localhost zzz]# ovs-appctl bond/show
> ---- eobond ----
> bond_mode: balance-tcp
> bond may use recirculation: yes, Recirc-ID : 1
> bond-hash-basis: 0
> lb_output action: disabled, bond-id: -1
> updelay: 0 ms
> downdelay: 0 ms
> next rebalance: 9673 ms
> lacp_status: negotiated
> lacp_fallback_ab: false
> active-backup primary: <none>
> active member mac: 98:a9:2d:c5:00:69(u0)
> 
> member u0: enabled
>   active member
>   may_enable: true
>   hash 89: 9007199254740413 kB load
>   hash 219: 9007199254740991 kB load
> 
> member u1: enabled
>   may_enable: true
>   hash 141: 9007199254520657 kB load
> 
> 3. Analysis:
> 
> After performing down and up operations on bond members, the recirc rules
> change. bond_entry_account() updates the bond entry statistics from the
> recirc rule statistics. If rule_tx_bytes < entry->pr_tx_bytes, the unsigned
> subtraction that computes delta wraps around.

So, the main issue here seems to be that the statistics on the rule
itself jump back for some reason.  There were a few patches in the
past year or so that fix several occurrences of similar statistics
issues during flow dumps.  Can you reproduce the issue on the latest
v2.17.8 release?  2.17.2 is fairly old and doesn't contain many important
bug fixes.  Newer releases like 3.1.2+ also have enhanced logging
around statistics mishaps, so they are easier to debug.
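
To illustrate the arithmetic in the quoted analysis, here is a minimal
standalone sketch, not OVS code.  'struct entry_stats',
account_unchecked() and account_checked() are hypothetical names.  The
first repeats the unconditional subtraction from bond_entry_account();
the second is an assumed variant that skips the update when the rule
counter goes backwards and reports that to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two counters of 'struct bond_entry' that
 * matter here.  This is not the OVS definition. */
struct entry_stats {
    uint64_t tx_bytes;     /* Accumulated load used for rebalancing. */
    uint64_t pr_tx_bytes;  /* Last byte count seen on the recirc rule. */
};

/* Same arithmetic as the quoted bond_entry_account(): the subtraction is
 * unsigned, so it wraps modulo 2^64 if the counter went backwards. */
static void
account_unchecked(struct entry_stats *e, uint64_t rule_tx_bytes)
{
    assert(e != NULL);

    uint64_t delta = rule_tx_bytes - e->pr_tx_bytes;

    e->tx_bytes += delta;
    e->pr_tx_bytes = rule_tx_bytes;
}

/* Assumed variant (not what OVS does): ignore a backwards jump and tell
 * the caller about it.  Returns true if the counter went backwards. */
static bool
account_checked(struct entry_stats *e, uint64_t rule_tx_bytes)
{
    assert(e != NULL);

    bool went_back = rule_tx_bytes < e->pr_tx_bytes;

    if (!went_back) {
        e->tx_bytes += rule_tx_bytes - e->pr_tx_bytes;
    }
    e->pr_tx_bytes = rule_tx_bytes;
    return went_back;
}
```

With pr_tx_bytes = 5000 and a new reading of 1000, account_unchecked()
adds 2^64 - 4000 to tx_bytes, while account_checked() leaves tx_bytes
alone and returns true, so the caller could log the event.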

Best regards, Ilya Maximets.

> 
> static void
> bond_entry_account(struct bond_entry *entry, uint64_t rule_tx_bytes)
>     OVS_REQ_WRLOCK(rwlock)
> {
>     if (entry->member) {
>         uint64_t delta;
> 
>         /* Unsigned subtraction: wraps if rule_tx_bytes < pr_tx_bytes. */
>         delta = rule_tx_bytes - entry->pr_tx_bytes;
>         entry->tx_bytes += delta;
>         entry->pr_tx_bytes = rule_tx_bytes;
>     }
> }
> 
> 4. Solution:
> 
> I tried adding last_pr_rule to struct bond_entry to solve this problem.
> When the recirc rule changes, delta = rule_tx_bytes, so
> entry->tx_bytes += rule_tx_bytes.
> But I'm not sure whether the value of entry->tx_bytes is correct after
> the modification.
> 
> index ddc96a4..7b14d53 100644
> --- a/openvswitch-2.17.2/ofproto/bond.c
> +++ b/openvswitch-test/ofproto/bond.c
> @@ -71,6 +71,7 @@ struct bond_entry {
>       * 'pr_tx_bytes' is the most recently seen statistics for 'pr_rule', which
>       * is used to determine delta (applied to 'tx_bytes' above.) */
>      struct rule *pr_rule;
> +    struct rule *last_pr_rule;
>      uint64_t pr_tx_bytes OVS_GUARDED_BY(rwlock);
> };
> 
> @@ -990,8 +991,12 @@ bond_entry_account(struct bond_entry *entry, uint64_t rule_tx_bytes)
>      if (entry->member) {
>          uint64_t delta;
> 
> -        delta = rule_tx_bytes - entry->pr_tx_bytes;
> -        entry->tx_bytes += delta;
> +       if (entry->last_pr_rule != entry->pr_rule) {
> +           entry->tx_bytes += rule_tx_bytes;
> +       } else {
> +            delta = rule_tx_bytes - entry->pr_tx_bytes;
> +            entry->tx_bytes += delta;
> +       }
>          entry->pr_tx_bytes = rule_tx_bytes;
>      }
> }
> @@ -1027,6 +1032,7 @@ bond_recirculation_account(struct bond *bond)
>              continue;
>          }
>          bond_entry_account(entry, stats.n_bytes);
> +       entry->last_pr_rule=rule;
>      }
> }
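
One way to reason about the proposed change is a reduced model of it.
The sketch below is not OVS code: 'struct entry_model' and
account_patched() are hypothetical, and it assumes that 'rule' in
bond_recirculation_account() is entry->pr_rule and that a newly
installed recirc rule starts counting from zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced model of the patched 'struct bond_entry'. */
struct entry_model {
    const void *pr_rule;       /* Rule currently used for this entry. */
    const void *last_pr_rule;  /* Rule seen at the previous accounting. */
    uint64_t tx_bytes;         /* Accumulated load. */
    uint64_t pr_tx_bytes;      /* Last byte count seen on the rule. */
};

/* Mirrors the patched bond_entry_account() plus the 'last_pr_rule'
 * assignment that the patch adds to bond_recirculation_account(). */
static void
account_patched(struct entry_model *e, uint64_t rule_tx_bytes)
{
    assert(e != NULL);

    if (e->last_pr_rule != e->pr_rule) {
        /* Rule changed: take the whole counter as the delta.  Only
         * correct if the new rule started from zero (assumption). */
        e->tx_bytes += rule_tx_bytes;
    } else {
        /* Same rule: unsigned delta, still wraps on a backwards jump. */
        e->tx_bytes += rule_tx_bytes - e->pr_tx_bytes;
    }
    e->pr_tx_bytes = rule_tx_bytes;
    e->last_pr_rule = e->pr_rule;
}
```

In this model a rule change is accounted for without wrapping: readings
of 5000 and 7000 on one rule followed by 1000 on a new rule give
tx_bytes = 8000.  A counter that jumps back on the same rule still
wraps, though, which is the case described in the reply above.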
