On 1/31/24 10:29, Zhangweiwei via discuss wrote:
> Hi,
>
> I encountered an issue while using OVS bond in balance-tcp mode.
> After performing down and up operations on bond members, the bond
> entry statistics displayed by ovs-appctl bond/show occured overflow.
> In addition to the statistics values issue, this also led to longer
> load balancing time for bond members.
>
> 1、information:
> ovs version:2.17.2
> bond mode: balance-tcp
> openflow: cookie=0x0, duration=7027.270s, table=0, n_packets=15169077,
> n_bytes=9334457220, priority=0 actions=NORMAL
> datapath_type: netdev
>
> 2、ovs-appctl bond/show print:
>
> [root@localhost zzz]# ovs-appctl bond/show
> ---- eobond ----
> bond_mode: balance-tcp
> bond may use recirculation: yes, Recirc-ID : 1
> bond-hash-basis: 0
> lb_output action: disabled, bond-id: -1
> updelay: 0 ms
> downdelay: 0 ms
> next rebalance: 9673 ms
> lacp_status: negotiated
> lacp_fallback_ab: false
> active-backup primary: <none>
> active member mac: 98:a9:2d:c5:00:69(u0)
>
> member u0: enabled
>   active member
>   may_enable: true
>   hash 89: 9007199254740413 kB load
>   hash 219: 9007199254740991 kB load
>
> member u1: enabled
>   may_enable: true
>   hash 141: 9007199254520657 kB load
>
> 3、analysis:
>
> After performing down and up operations on bond members, recirc rules are
> changed,bond_entry_account( ) function updates bond entry statistics through
> recirc rules. rule_tx_bytes < entry->pr_tx_bytes , so delta occurs overflow.

So, the main issue here seems to be that the statistics on the rule
itself jump back for some reason.  There were a few patches in the past
year or so that fix several occurrences of similar statistics issues
during flow dumps.  Can you reproduce the issue on the latest v2.17.8
release?  2.17.2 is fairly old and doesn't contain many important bug
fixes.  Newer releases like 3.1.2+ also have enhanced logging around
statistics mishaps, so they are easier to debug.

Best regards, Ilya Maximets.

>
> static void bond_entry_account (struct bond_entry *entry, uint64_t
> rule_tx_bytes)
>     OVS_REQ_WRLOCK(rwlock) {
>     if (entry->member) {
>         uint64_t delta;
>         delta = rule_tx_bytes - entry->pr_tx_bytes; // delta occurs overflow
>         entry->tx_bytes += delta;
>         entry->pr_tx_bytes = rule_tx_bytes;
>     }
> }
>
> 4、solution
>
> I try to add last_pr_rule in struct bond_entry to solve this problem. When
> then recirc rule changes, delta = rule_tx_bytes, and entry->tx_bytes +=
> rule_tx_bytes.
> But I’m not sure whether the value of entry->tx_bytes is correct after the
> modification.
>
> index ddc96a4..7b14d53 100644
> --- a/openvswitch-2.17.2/ofproto/bond.c
> +++ b/openvswitch-test/ofproto/bond.c
> @@ -71,6 +71,7 @@ struct bond_entry {
>       * 'pr_tx_bytes' is the most recently seen statistics for 'pr_rule', which
>       * is used to determine delta (applied to 'tx_bytes' above.)
>       */
>      struct rule *pr_rule;
> +    struct rule *last_pr_rule;
>      uint64_t pr_tx_bytes OVS_GUARDED_BY(rwlock);
>  };
>
> @@ -990,8 +991,12 @@ bond_entry_account(struct bond_entry *entry, uint64_t rule_tx_bytes)
>      if (entry->member) {
>          uint64_t delta;
>
> -        delta = rule_tx_bytes - entry->pr_tx_bytes;
> -        entry->tx_bytes += delta;
> +        if (entry->last_pr_rule != entry->pr_rule) {
> +            entry->tx_bytes += rule_tx_bytes;
> +        } else {
> +            delta = rule_tx_bytes - entry->pr_tx_bytes;
> +            entry->tx_bytes += delta;
> +        }
>          entry->pr_tx_bytes = rule_tx_bytes;
>      }
>  }
> @@ -1027,6 +1032,7 @@ bond_recirculation_account(struct bond *bond)
>              continue;
>          }
>          bond_entry_account(entry, stats.n_bytes);
> +        entry->last_pr_rule=rule;
>      }
>  }
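
To make the wraparound described in the quoted analysis concrete, here is a minimal standalone sketch in C. It is not OVS code: account_delta(), account_delta_guarded() and check_account_delta() are hypothetical names, the byte counts are made up, and treating a counter that went backwards as "the rule was replaced and restarted from zero" is only an assumption for illustration, not a confirmed fix.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as the quoted bond_entry_account(): an unsigned
 * subtraction, which wraps modulo 2^64 if the rule counter is lower
 * than the previously recorded value. */
static uint64_t
account_delta(uint64_t rule_tx_bytes, uint64_t pr_tx_bytes)
{
    return rule_tx_bytes - pr_tx_bytes;
}

/* Hypothetical guarded variant (an assumption, not the OVS behaviour):
 * if the counter went backwards, assume it restarted from zero and
 * count its whole current value as the delta. */
static uint64_t
account_delta_guarded(uint64_t rule_tx_bytes, uint64_t pr_tx_bytes)
{
    return rule_tx_bytes >= pr_tx_bytes
           ? rule_tx_bytes - pr_tx_bytes
           : rule_tx_bytes;
}

/* Self-check with made-up byte counts. */
static void
check_account_delta(void)
{
    /* Normal case: the counter only grew. */
    assert(account_delta(1500, 1000) == 500);
    /* Counter jumped back: 100 - 1000 wraps to 2^64 - 900. */
    assert(account_delta(100, 1000) == UINT64_MAX - 899);
    /* The guarded variant reports 100 instead of the wrapped value. */
    assert(account_delta_guarded(100, 1000) == 100);
}
```

Note that a guard like this can still undercount: if the counter restarts and then grows past the old value before the next poll, the backwards jump is never observed. That is one reason the accuracy of entry->tx_bytes after such a change remains an open question.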