Hi, Eelco and Ilya, We have observed the similar case in our production environments.
After digging some code, I would like a discussion: In the situation that the ovs revalidator is under high load, i.e. the number of megaflows are too large, and the revalidators would try to delete some megaflows, at the same time, there are traffic keeps generating new megaflows through upcalls. the underlying cmap of megaflows changes due to a lot of deletion and insertion (cmap will change its bucket number, capacity, and re-place the megaflows in different buckets), resulting in a same magaflow might be dumped twice in two revalidator threads in one dump stage. Consider a case: revalidator 1 gets the megaflow A dumped and its related ukey u(A), and decide to kill the megaflow A. After megaflow deletion, the pmd generates a new megaflow B with the same ufid, i.e. the same match and the same actions, and replace the ukey u(A) with u(B). Now revalidator 2 gets the old megaflow A again, and it will find the ukey u(B). u(B) is the new ukey, it's dump_seq is 0, and this leads megaflow A is viewed as megaflow B, and its statistics are contributed into ukey(B). The megaflow A and megaflow B is in essential the same one, the only difference is the statistics value. This mismatch of megaflow and ukey has two side effects: 1) the megaflow A's statistics have been contributed twice, leading to a amplified value of openflow rule's statistics. This is our observation in our environments. 2) revalidator 2 might also decides to kill ukey(B), but now its statistics equals to megaflow A, and thus results in a mismatch value in the email that Eelco observed. To fix the case 1), I think, during the ukey replace, the new ukey generated by upcall should take the dump_seq value of the replaced old key, and avoid the old megaflow being dumped and find the new key. and the case 2) is also fixed. Any thoughts? Eelco Chaudron <[email protected]> 于2022年2月22日周二 16:10写道: > > > On 21 Feb 2022, at 12:29, Eelco Chaudron wrote: > > > On 17 Feb 2022, at 14:10, Ilya Maximets wrote: > > > >> On 1/31/22 11:54, Eelco Chaudron wrote: > >>> Make sure to only update packet and byte counters when valid, > >>> or else this could lead to "temporarily/occasionally" > >>> out-of-sync flow counters. > >> > >> There was already the same patch submitted here: > >> > https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/ > >> > >> And I'm still not comfortable with the change, because it seems > >> like it only hides the underlying datapath problem. Do you know > >> why exactly datapath stats become lower than previously reported? > >> > >> If it's some kind of a statistics flush, that will mean that flow > >> statistics will not be updated until new stats will catch up to the > >> old value leading to the flow revalidation and incorrect flow stats > >> anyway. > > > > This was very hard to reproduce, only once out of 20-30 runs if I > remember correctly. > > > > Without the patch, it would sometimes show very high numbers and then > got updated after a while with the correct numbers. > > At least this is what I remember, as I did not take any notes, and my > brain wanted to forget this patchset :) > > > > > > Guess the fix is needed anyway as this behavior was there since day one > in revalidate_ukey(), e79a6c833. > > > > I also see that you got no reply to your comments, so I’ll take another > stab to make sure this patch is really fixing the problem, or fixing a > problem, and hiding another TC problem. > > > >> We fixed incorrect stats on flow modification for dpdk offload provider > >> previously here: > >> > https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/ > >> > >> Do we need something similar for tc? > > > > I’ll take a close look at the DPDK patch, once I get my reproducer going. > > Unfortunately, after an afternoon and night or running tests, I can not > replicate the problem with the weird counters :( > > So for now I’ll drop this patch from the set, and hopefully, it will > resurface when I’m integrating the system-traffic tests into the hardware > offload set. > > <SNIP> > > _______________________________________________ > dev mailing list > [email protected] > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > -- hepeng _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
